Diversity matters in digital scholarly technology – A conversation with Mark Hahnel

Published by Access 2 Perspectives on


Mark Hahnel is the CEO and founder of Figshare, which he created whilst completing his PhD in stem cell biology at Imperial College London. Figshare currently provides research data infrastructure for institutions, publishers and funders globally. He is passionate about open science and the potential it has to revolutionize the research community. For the last eight years, Mark has been leading the development of research data infrastructure, with the core aim of reusable and interoperable academic data. Mark sits on the board of DataCite and the advisory board for the Directory of Open Access Journals (DOAJ). He was on the judging panel for the National Institutes of Health (NIH), Wellcome Trust Open Science prize and acted as an advisor for the Springer Nature master classes.

ORCID iD: 0000-0003-4741-0309 

Website: figshare.com 

Linkedin: /mark-hahnel

Twitter: @MarkHahnel 

Which researcher – dead or alive – do you find inspiring? My old PI Sara Rankin, just for the sheer range of her interests and achievements. She is a Professor of Leukocyte and Stem Cell Biology at Imperial College London, a science/art collaborator, and a more recent leader in the UK neurodiversity movement.

Reference

NIH issues a seismic mandate: share data publicly (Feb 2022), https://www.nature.com/articles/d41586-022-00402-1

Science, Digital; Simons, Natasha; Goodey, Greg; Hardeman, Megan; Clare, Connie; Gonzales, Sara; et al. (2021): The State of Open Data 2021. Digital Science. Report. https://doi.org/10.6084/m9.figshare.17061347.v1

TRANSCRIPT

Jo: Welcome back to Access 2 Perspectives conversations. Here with me today is Mark Hahnel from Figshare. Welcome, Mark.

Mark: Thank you for having me, Jo. 

Jo: It’s a great pleasure. So Figshare just turned ten: ten years in operation, on the market and of service to the scholarly community. Let’s jump right in. What are your, I don’t know, maybe three or five takeaways from the past ten years? Then, moving on, we can take a look into the future. So how has the past decade of Figshare been?

Mark Hahnel: Yeah. So it is ten. It was ten in January, and it feels like not long at all but, at the same time, a lifetime. And I think a lot has changed. Figshare started because I needed a place to make some of my files available and get credit for them when publishers wouldn’t accept them because the files were too big. And we’re talking about five megabytes of video files. So I think the Web has moved on in the way that we’d expect, with cloud computing and just a lot more storage and a lot more capacity for viewing different types of academic content on the Web. But really, when Figshare started, I was coming out of my PhD, and my plan was: open data is still a very new concept, so let’s give it a year. If it’s going well, I’ll continue. But after a year, if it’s not going well, I’ll just come back and do a postdoc. A decade later, I think the field of stem cell biology has moved on so much that it has been a long time since I would have been accepted back into a postdoc. So on that thought of where open data sits in the global psyche with regards to importance in academia, and this open research ethos in general, I think it has obviously moved on a lot, so that things can be sustainable. We were just awarded a grant from the NIH to work with other generalist repositories like Dryad and the Center for Open Science. So that’s a great place to be: we’ve gone from “will this be a thing that people care about enough for a repository to sustain itself?” to the NIH mandating open data from January 2023 for all of the researchers they fund, along with the associated support that is needed around that. So I know you asked for three. I think it’s a good thing that in the last ten years things have moved on so much that we’re still here. But in turning ten, I’ve really been thinking a lot about the last ten years. We’ve been encouraging researchers to put files on the Internet and openly share files that they previously wouldn’t.
And the next ten years is really about: how can we use them for good? How can we make them more useful, so that we achieve the long-term goal of making academia more equitable and more efficient, and making research work in a way that isn’t limited by waiting for papers to come out? Fast but good publishing is the thing I think about a lot these days.

Jo: I just came out of a conversation before this one where we touched upon data maintenance, especially for repositories. We also think about that quite a bit. There’s no urgency for that as yet with AfricArXiv, because we are still building and advocating, encouraging African scholars to share their work. Not that they don’t want to, but there are just a lot of barriers to work through, especially in that part of the world. But now that we’re encouraging more data sharing online, which is certainly good for reproducibility, accountability and transparency in research, how about data maintenance? How about cleaning up our closets once in a while and ensuring that we don’t use storage space, and therefore eventually produce CO2 emissions, for no good reason? Is there a level of quality assurance that you at Figshare have settled on? Many repositories put it in terms of preserving accountability, and we are also struggling with this at AfricArXiv: there’s a level of accountability that’s needed, meaning what’s online should stay online, theoretically for good. But what does “for good” mean on a human scale, or in time measurements, or in storage capacity? Really, it’s easier said than done in the digital age, because digital files decay as well. So those are first the technical issues. But then, if we accumulate bytes en masse, can we still make sense of it at a certain point? And then we’re calling for artificial intelligence. But there are already a lot of studies on how reliable those systems can be: they are human-made, so they come with biases. And there’s also a critical amount of carbon emissions that comes with the storage, which is now in the double digits if you compare it to other things, and so forth. It’s probably a little bit of opening Pandora’s box. But is this something that Figshare, or you yourself, are also discussing or considering?

Mark Hahnel: Yeah. So I know our parent organization, Digital Science, has goals around carbon neutrality and things like this. So it is an actively discussed thing. The space is kind of a little bit bipolar in some of its ideas. You tell a lot of my colleagues that they have to make their data sets available, that they have to make their papers open access, and all of these things. And they’ll happily pay $12,000 to get published in Nature if they have the funding available, but they’ll complain about $1,000 to PLOS or Scientific Reports, because that is indoctrinated into the system. And so when we talk about data, I still have close friends who say, “I’m not making my data available until I’m forced to, because other people will scoop me,” and all of these things. I’d say that after ten years of open data and 5 million open files on Figshare, we see that this isn’t the problem that people were thinking it would be. In the same way, I’ve seen people take photos of sunsets on Instagram and then put copyright notices on them. And while that may be a concern, I think there are enough openly available photos of sunsets that people don’t need to go and steal your proprietary artwork. But it is built into the system. And so when I say bipolar: we want to encourage more people to make more of their content openly available. At the same time, there is a cost basis to it, and there is a climate impact to it. And I think there are other things on that level: people store so much content on mobile phones and things like this, photographs that they’re never going to clean out. But the one thing I think is really interesting, when you mention preprints and data, is that you have the traditional publication system, which we’re all familiar with, and then you have preprints. And I think the big battle there is fast but good publishing.
How do you make sure that the scandalous newspapers and journalists of the world don’t interpret things in unscrupulous ways, as we’ve seen with COVID? And then the data space has that problem too, in that there’s no peer review. And I don’t think there’ll ever be peer review of data, because what you’re asking is: is this data sound? Has it been well described? Was it obviously collected correctly? Is this novel data? That doesn’t require three of your peers. It just requires somebody who knows what they’re talking about to make sure it’s well described. So it’s more like an editorial check than peer review, in my mind. But it still has this problem of making sure it isn’t interpreted wrongly. And then you also have the problem that it’s not a single unit. A paper published in Nature is a six-to-20-page PDF. A preprint is a six-to-20-page PDF. A data set could be “I have 100 space images, which is three petabytes,” or “I have one model, which is five megabytes”; I only need one DOI for this one, and I need 3,000 DOIs for that one. And you don’t want there to be a limitation on reuse. This is another bipolar thing: you want to encourage as much reuse as possible, but the more reuse there is, the more costs there are, and all of these things. It’s not so much the storing of data, it’s the moving of data that’s the problem. So in the future, I think bringing the compute to the files will solve a lot of these problems. But for now, how do we encourage people to reuse as much as possible while also being aware that that could involve 10 million downloads of a file all around the world? Right?

Jo: I think it’s also a matter of counting all the costs that are implied. I have a dog on my left, for those who see the video, and I’m thinking about animal research: if data is made available, I’m sure there would be less need for much of that. I’m not saying there shouldn’t be any. I mean, I wish there weren’t, but I understand that there are good reasons to have some animal research. That’s another discussion, to be had in an ethics community or somewhere else, and I’m happy to have it on another occasion. But it’s not only about animal welfare. It’s also about wellbeing: how many PhD students have had to repeat experiments that had already been shown, thousands of times, not to work or not to generate the results you expect, because the data has not been made available or open access? So that’s a cost too: the mental wellbeing of the researchers who have to repeat stupid experiments because there is no platform, and not enough of a sharing culture, to access information that’s already available somewhere. And there is also the environmental impact. What we see as the cost versus the benefit of doing the work in the first place needs to be assessed. And talking for a minute about the work that you contribute to with the State of Open Data report, which comes out annually: in the most recent ones, did you see any development over the past three to five years in how data is increasingly being shared through the policies and FAIR data principles that come into play? I’m sure there are improvements. And I’ve also seen some of your presentations of that report. So there are some success stories to share there, right? And still some more work to be done. But yeah, just share some of that.

Mark Hahnel: So the way you phrase that is a really interesting angle as well, because I’m quite the optimist, so I tend to focus on the positives. And it’s always crazy looking back at ten years and thinking, wow, that’s amazing, look how far we’ve come. As I said, 5 million files that weren’t openly available are now openly available. Anyone can publish anything. Back when we started, there wasn’t a place where you could publish stuff like that for free, which is why we started. Now you do have things like Zenodo, and you do have preprint servers, and this is a newer thing. There was arXiv for physics, but that was about it. If I think about what has happened, and the potential for moving the space on and making academia more efficient, the example I’ve used in one of those talks is DeepMind making use of the Protein Data Bank and having the protein folding problem, which was a costly 50-year endeavor, just solved overnight. They went from 30% of all human protein structures accurately predicted to 99%. So that’s a step change in that field that couldn’t have happened without open data that is well described. And how do we do that for all data? So that’s what I focus on as an optimist. But, and this is a big but, I remember in the early days of Figshare being invited to talk at a conference specifically on how open data can help reduce the number of animals used in research, for the reasons you explained: people doing the same experiment over and over again. And I would say that that is something that hasn’t moved along at some level. If you find a paper and see someone has done an experiment, you can get the data behind it. But the idea of negative research being made openly available is not happening. The largest driver for people making data openly available is that they have something they want to share with the world, and they’ve got some really cool file formats that don’t translate well into a publication,
or they are being told to as part of their funding, or they’re publishing a paper and they’re told they need to make the data available. And so, if anyone can just make any of their data openly available, will they make all this negative data openly available? “I had a good idea. I tried this, and it turned out my hypothesis wasn’t correct. But here’s the data anyway.” The incentive structure is not there for busy researchers to make their data available on that level. So there’s still a lot of wastage there. We’ve had a lot of different funder policies over the years, from South Africa, China and Western Europe. The one I’m focusing on is the NIH’s, coming next year, because the NIH is the largest medical research funder on the planet. Even that policy is still: if we fund you, then at the point of your publication, make your data available. That’s the mandate. It’s not “publish the result of every experiment you’ve done.” And it might be that they can’t just jump straight in and drag people towards this. But, yeah, if I don’t just focus on the positives, I see there’s lots of work that can be done on improving efficiencies when it comes to things that are important, like animal research.

Jo: On that note. Well, maybe light at the end of the tunnel, which…

from one German institute, they’ve developed a database where any researcher from around the world can register animal research experiments before they get started, to have them assessed for regularity and, while planning, to replace, reduce and refine, following the 3R principles of animal research, which are hopefully becoming better known. So there is at least this, and I think in the UK there’s also a registry of a similar sort. I agree that the initiatives and services on that angle are still scarce, so there’s quite a bit of work to be done.

Mark Hahnel: I think there’s a good point there around broad strokes or narrow focus. For problems with a specific use case like that, the broad idea might be that a high tide lifts all ships: if everyone is making more data available, there is going to be more efficiency, and if everyone is making their papers openly available so anyone can read them, there is going to be more efficiency. But at the same time, it’s like the moonshot missions. They wanted to land on the moon and they made it happen. If you want to reduce animal research inefficiencies, then you come up with a solution like that. And I think this is also where we’ll see more work. Joe Biden has the Cancer Moonshot in America, which revolves around open data about cancer, with a mission. And I think of the Sustainable Development Goals, climate change. If you apply each of those contexts to open research and open data, I think that’s how we’ll bring into focus some of the return on investment for making data openly available over the years.

Jo: So that’s one question we had in one of your presentations: do you believe that open science and open data can save the world? Looking at the pandemic, looking at what we’re now dealing with in Ukraine and Russia, and ongoing climate change, do you believe that we as researchers can foster open data and FAIR data in a way that also allows other societal stakeholders to work with us scholars to change for the better, to do better than we’ve done in the past?

Mark Hahnel: Yeah. It’s been an awful way to find out some of these things. COVID has obviously had tragic consequences for a lot of people. But as a humanity-wide experiment, it has proven some things that people may have questioned before. On the negative side, people will believe what they want to believe, though that is a minority. There are small subsets of people to whom you can show that this ball is red, and they will believe whatever fits their agenda. But the vast majority of people do care about getting the evidence and understanding what’s happening. And I think we’ve seen this in the past with open access to papers, where people get ill, or their family members get ill, and they want to read up and they can’t access any of the content. And they haven’t heard of Sci-Hub because they’re not in academia. So there are problems that we see on that level. And then you can see that the sheer amount of literacy around understanding and interpreting data from COVID means everybody understands now that this should be the norm. If you make a statement, particularly around health care concerns, you need to back it up with the data. You can’t have Donald Trump saying, “I won the election,” and when people ask, “Well, can we see the results of the counts?”, answering, “You don’t need to see it. I won the election.” But that’s what we have in academia, right? We have people publishing papers with “data available upon request,” and it has never been made available and never will be. I’ve often thought about asking for retractions of those papers, but I think I’d be hunted in the street if I suddenly caused the retraction of a bunch of papers. So it’s a balance, really.

Jo: One of my favorite topics is global research equity. I’m using that term also in the sense of where we’re coming from at AfricArXiv: to create a global scholarly community, which some would argue we already have. But I don’t see that happening as much. I’m not only blaming English as presumably being the lingua franca when we look at research, at scholarship and text, and what gets counted. English is probably predominant in the databases that we in the Western Hemisphere are familiar with. But then there’s also quite a large body being published in Russian, in Mandarin, in Portuguese. There’s a huge scholarly community in Brazil and Portugal and some African countries as well. Figshare is widely adopted around the world. So how do you see global research equity coming into play in the work that you do with your team? And how do you feel about being adopted in countries that are not the usual suspects of this world?

Mark Hahnel: Yeah. So I think there is huge opportunity, but there are also limitations. If we’re looking at the types of content that we’re talking about, Figshare came out of the space of data and research files associated with a research paper, and we now deal in paper repositories, thesis repositories, preprints and all of those things. But from a pure data point of view, from this new era of data publishing, I think what’s great is that... We’re no longer a startup; I feel that after ten years you can’t really call yourself one.

But the way we came into the world, working with Digital Science, was very much working with other startups in the space. And a lot of people had this idea: “if academic publishing was invented today, this is what it would look like.” And it’s true. But academic publishing wasn’t invented today, so a lot of the assumptions in the tools you’re making are invalid. If you’re saying, “well, if academic publishing was invented today, everybody would just have a lab wiki and put their information on there,” it’s true, but it’s not going to work, because people want to get published in high impact factor journals. There’s a lot of work in that space from DORA and other folks trying to reduce that. And I think universities have a role to play, and funders have a role to play with that. But shifting that mindset is going to happen one funeral at a time, as they say. It’s hard to move past. Whereas in the data space you don’t have that. You have research organizations not necessarily relying on publishers and taking back their own control over it, and funders and governments taking control over their own data. So that’s great for equitability. Where the equitability may fall down is this: there’s a great term, FAIR data, meaning findable, accessible, interoperable and reusable data, and you can do a bit of that using machines, right? The repositories we provide support FAIR. If you have FAIR data and put it in a Figshare repository, it will be FAIR. But while we can tick all of the technical boxes, you need the human element too. Take findable: findable is great if you give it a great, detailed title, and you can nudge people and say “‘data set’ is not a great title, maybe change the title,” but people ignore it.
What helps is the education level and the curation level and the librarianship level, which isn’t equitable around the world, and it isn’t equitable within each country, because you’ll always have the big, well-funded organizations that can provide more support and the smaller, less well-funded organizations that have less support. Their researchers therefore might not be describing their content well, so it doesn’t get found as much, so it doesn’t get reused as much. And if reuse is the metric that drives your career, then you’re going to have this rich-get-richer, poor-get-poorer kind of dynamic. So I think that is one thing that funders, internationally and within each country, should be looking at: how do we make sure that for new methods of research communication, there is equitability in support?

Jo: Yeah, I agree. As a trained biologist, I’m also coming from a research environment that was looking into evolution, and I know how to appreciate diversity not only in natural ecosystems but also in technical ecosystems. And that’s what we’re trying to foster with AfricArXiv.

We’ve had discussions around, and also comparisons of, for-profit and nonprofit. Not only that, but the digital infrastructure has been developed in silos, for good reasons. Nothing wrong with that. There’s always a starting point that is very specific. And now we live in an era where service providers are trying to connect the dots. And that’s also what you mentioned earlier, that you now work on that. What is it called, the project that you mentioned earlier …

Mark Hahnel: With the NIH? It’s called the GREI project, the Generalist Repository Ecosystem Initiative.

Jo: So how do you position Figshare in the ecosystem of global scholarly infrastructure that is in the process of being developed? And what do you see as the benefits of having a diverse system, or a system that’s composed of diverse entities? Like I said, for-profit, but also nonprofit and anything in between. I also have a paper together with my friend and colleague, where we’re looking into these things. How is each of these repositories and services funded? Is it a single funding source or is it mixed funding? And where is the funding coming from? National or regional levels? These are all details that can be quite confusing, and they are complex and diverse in their nature on various levels. And I think that’s a good thing. So that’s my opinion. What’s yours?

Mark Hahnel: Yeah, I think it’s a good thing. I think it’s a well-established fact now that a diversity of opinions can help. Collaboration across the world is also a good thing for improving outcomes. I think there are strengths and weaknesses to everything. When Figshare started, we spoke to a couple of different folks and then, getting startup investment, we became a commercial entity. And the core focus immediately was sustainability. As things stand today, we have hundreds of paying clients. You might say, well, okay, some of them might cancel next year, but the chances of all of them, hundreds of them, canceling are small, which means that we’re quite secure, and we can model out how we can grow and how we can hire more people in different places to do different things. And we are a global team now. We have employees in Africa, in America, in Australia, and in continental Europe as well. It is useful to have that, and during COVID we have seen even more that collaboration can happen. It’s sometimes frustrating. I still come to the office myself, even though not many other people do. But if you’re thinking about where the balance is, the thing I always tell people when they’re working with Figshare is: don’t trust me, I could be dead tomorrow. Trust the contracts that you’re signing. Make sure the contracts you’re signing adhere to everything that you’re hoping they adhere to. Make sure you’re avoiding lock-in, which is a traditionally difficult thing in the academic space. The flip side to this is that if we hadn’t gone that route, we’d have got a grant and spent all the money, and it would have been nice, and we’d be an open source repo somewhere, like a thousand other repository systems. So forcing sustainability is a big thing. And I see a lot of organizations, since I’m on the DataCite board and on the advisory committee for DOAJ, and I see a lot of institutional membership fatigue.
If you’re just saying, “oh, you pay $10,000 for this reason,” it’s difficult, because people say, “well, there are 100 people asking, so why you?” If they start having to prioritize ORCID over this one, then it gets difficult. So sustainability is easier if it’s just upfront and transparent. I always say the aim is to put in more value than you take out. I think there are a lot of economies of scale, but I’m also happy that the space we work in has a diversity of tooling. Because, speaking as the CEO of Figshare, the thing we’d want to avoid is Figshare becoming a monopoly in this space. Then you’re inviting in the potential for bad behavior. So I’m delighted that there’s been this Cambrian explosion of other systems that you can use, like Zenodo or Dryad; some of the commercial ones, maybe not so much. But it’s always a difficult balance. On the not-for-profit side of things, the learning that should be there is to focus on sustainability in the same way you would if you were going to run out of investment for your startup. And then vice versa, learning from data science and folks like that, it’s: what is your objective here? What problem are you setting out to solve, and are all your goals aligned with solving that problem? And I think if you can match the two, then both sides can learn from each other.

Jo: Yeah. That’s also what I believe in, and what I’ve seen functioning. In line with that, what I also often encounter, and I think I was trapped in that myself, if I’m honest with myself, is this: for some time I thought, oh, we do this for a greater good, I do this voluntarily, for no payment, not realizing that I was doing this from a privileged or luxury position, like having a salary or other securities to give me the ease of mind to do the work free of charge. But somebody is still paying. And I feel that many nonprofits have a similar mindset. Not generally speaking, but it points towards what you just said: focus on sustainability. We’re creating valuable services, and we employ people who need to put food on the table, including myself. There’s only so much you can do. We all have to eat at the end of the day, no matter where in the world we live.

Yeah. It comes down to a very entrepreneurial or for-profit approach. Services need to get paid for. The question is, who’s paying? Should it be the researchers? Should it be, again, the public through taxes? Now that we create services that are internationally applicable or usable, should a nation, or an institute in one city, or one department in an institute in one country pay for all of that? These questions are being asked everywhere you look. And there’s also a diversity of solutions to that, or evolutions that occur. Take Zenodo, which we both mentioned: a certain institute made a decision, I don’t know how long ago, about a decade or a little bit more, that yes, we’re going to finance the model forever. It just seems very generous. But how much longer can they really do that, now that it has been more widely adopted? But again, we also see a diversity of services that are similar and yet different. We can keep learning from each other. So I’m just pointing those out. And I think the beauty of all of us talking to each other and reflecting on these things is that it allows us to create the best practices that are needed to provide good services in the long run.

Mark Hahnel: Yeah. Speaking frankly, one of the things I’ve found tough over the last ten years is that there’s a lot of moral signaling and good versus evil, and commercial, in some people’s eyes, falls on the evil side. It’s difficult, because I’ve felt that I’m not able to speak my mind on public matters, because of the political situation surrounding it and things like that. If anything, it has forced me to retract away (I know I’m talking publicly now), and I do a lot less hypothesizing on these things now, because there’s no benefit. You think you’re doing something for the greater good, and you can be committed to that, but if you’re constantly getting stones thrown at you, it’s demoralizing. So you just have to focus on understanding what your plans are, and, as I was saying about what you can learn from the not-for-profit communities, on what we are trying to achieve and what the North Star is. We have the Figshare core principles listed on our website, where you can go and see them, and we just stick to that and get on with it. And it’s nice to have forced collaboration at a time when sometimes it might not be happening naturally.

Jo: I think what I also realized at some point concerns entrepreneurship, or maybe commercialism. I don’t know if commercialism is the word, but looking at entrepreneurship and running a company: I think the breaking point was the Second World War, and the time since then has been all about growth, no matter the cost. We have to make more revenue. Why? Well, we don’t know why. Because some people in this company want to get richer, and because we are being measured according to growth. Before that, in the old days, whenever that was, from what I’ve seen in Germany and many European countries, and for sure elsewhere, companies would treat their staff like family. And they were filling gaps in society, providing services for the betterment of society and making life easier, for example by producing washing powder that would take away some of the work from women and allow them more quality of life, because now we have a washing machine. I think, with all these crises going around, we’re also in an era of coming back to that sense. Speaking as a biologist, limitless growth is cancer at the end of the day, and we see that in capitalism. It’s not working in nature. This world is not made for limitless growth, there’s no point to it in the long run, and too many people suffer along the way.

Mark Hahnel: Well, I think a lot of people have realized that during COVID times, with the number of people changing jobs or just deciding they don’t want to or don’t need to work. But again, that’s a luxury, like you were talking about before. This is slightly tangential, but one of the things I think about is that I come to the office because I like people and I like coming to the office. I’m in a hundred-person office with six people today, right? And I’m in London and COVID is over. But that’s not true globally. So I’m thinking now about how we encourage people. Obviously, people can work from home, and that’s no problem. But if you’re early in your career, you would benefit, in my opinion, from coming to an office and meeting people and evolving that way. And I think a lot of people will lose out from being just on Zoom meetings, because that’s the path of least resistance: I don’t have to get on a tube and all of this stuff. But again, that’s a place of luxury, because there are a lot of people who won’t be able to go back to the office for years because of COVID, right, because of the global disparities in vaccination and things like this. I think one of the countries is not giving vaccines to one of the other countries because of the Ukrainian invasion, and all of these things. I understand it. But at the same time, if we’re trying to move towards a global narrative of how everybody should be able to achieve anything in academia, there are going to be curveballs. I don’t think it’s going to be easy over the next five years just to get on a common footing with working environments.

Jo: Maybe we can get as political as we want to, and we can also cut it very short. But I wanted to ask about the situation at the onset of the war, when there were thousands of Russian scientists speaking out against the invasion and for peace, and clearly blaming... okay, that’s not too political, but there is the opposition, which we now also see in the news is being actively oppressed, as it has been over the past years and decades, but now increasingly so. Do you believe that open data and open science practices can create a level of academic freedom that can decouple from politics? Can we keep working with these researchers who want to continue doing research and also weather the political circumstances in their countries? How can they do that if we now ban everyone from collaborating? I don’t know. There’s probably no right answer to this. But given the current situation and what we know today, is it possible? Is there such a thing as academic freedom during wartime?

Mark Hahnel: Well, I mean, just as an example, one thing that’s upset me in the past about Sci-Hub is when they cut off access to Russia because Russian professors insulted Alexandra Elbakyan. And I’m like, well, that’s kind of counterintuitive to what Sci-Hub is supposed to be, isn’t it? It’s supposed to be that anyone can access any content anywhere (100% illegal, for the lawyers who are listening). But I found it to be a smack in the face when that happened. That said, I think as long as web infrastructure can persist across geographies, then things can continue to work and people can continue to build on top of the research that’s gone before. I remember being at Tomsk University in central Russia, which is just on the edge of Siberia, and seeing their libraries there, hundreds of years old. Their libraries are on a par with the University of Oxford and other long-standing academic institutions that have been around for hundreds of years and gone through hundreds of wars. The difference between then and now is that you have a way to communicate. And as long as that communication can happen, with access to content and access to information, then it’s no problem. And I think this is something we saw with COVID in the State of Open Data survey that we do annually: the number of people who were reusing other people’s data, or their own data, in a way that they hadn’t done in the past, out of necessity, was actually really good for the space in terms of efficiencies. Normally you are in the wet lab, just pushing, pushing, pushing with this moving-forward capitalist mentality, in which you move forward and don’t look back to see whether there is more information you could get out of the work you’ve done. When you can’t be in the wet lab, you’re forced to look back, and you find: yeah, I could write a few more papers about my findings here, or I could thoroughly finish off this project before moving on to the next bigger and better thing.
So I think that could happen with academia, no problem. Collaboration can continue as long as people have access to the Web.

Jo: And that’s again why we are working towards a globally inclusive technical digital infrastructure. In the interest of time, and coming towards some concluding remarks, let’s draw a utopia of scholarship. What’s your best-case scenario? I think it’s obvious: everybody has equitable access to research communication and is able to contribute to it. So maybe that was too easy. What are the next tangible steps we can and should take as researchers, as research service providers, and as a community of scholars?

Mark Hahnel: Yeah, I think there’s a tendency to think the tasks are insurmountable because they’re too big and culture is too big to change. But look at the way open access started in the 80s, and it was only last year that we went past 50% of publications being made open access, which is great, though, because it means that the majority is now open access. So it took a while, but we got there. As I say, looking back on the last ten years and looking forward to the next ten years, I think the things that need to be worked on are the training and education around ways to disseminate scholarly information. The funding of infrastructure, whether it’s grant-based or university-based or anything else, needs to continue from all fronts, in the same way that the training needs to come from all fronts. The publishers have a requirement there, or the publishers have a reason that they should be doing that, and they are doing that. You see that from PLOS and Springer Nature, who have both done a lot, and Taylor & Francis. They’ve all done a lot around data and pushing the needle on what traditional publishing is. And I think it’s easy to attack the publishers because they’re often seen as the cash cows. But at the same time, there are people within those publishers who are doing good. One of my concerns around that still is this: I just mentioned open access went over 50%, but if you compare gold open access with green open access, gold open access is hockey-sticking and green open access is linear, right, in terms of growth. And so I think there really needs to be a conscious push towards better green open access. This is where I’m excited to be working in the repository space, because we can help with that. Repositories have failed at that level, right? And this is coming from somebody who is in the repository space and builds repositories.
I’m saying we have failed at that so far, in that we’ve let gold open access become the easy mode. I’m not a grant-funded researcher, so if I want to publish a paper, what are my options? And, as I say, I’m in a good place and I have a job. But how do you make that equitable for everybody? So I think there is the continued push on open access publication, particularly green open access publication, and I think training around open data and communicating this idea that you will get better rewards if you make your data easy to find, which is a hard one. So that needs to go on globally. And then I think we’ll get to some consensus on this kind of preprints, fast but good publishing, right? I think with fast and good, if you pull on one, the other one moves too. The faster it is, the harder it is to check it’s good, and the more thoroughly it is checked, the longer it takes. But I think the fact that preprints have taken off so much just shows the desire to improve the traditional model, which may have some bits of good but is a long way from fast. So I think: focusing on fast but good, focusing on training around new realms of publishing, and pushing for green open access. And ten years from now, we’ll all be building on top of the work that’s gone before us and no animals will be getting harmed. Then we’ll still be mining that great big database in the sky.

Jo: Yes, that looks like the world I want to live in. Thank you. Okay, let’s keep working towards that. Thank you so much for being here for the conversation. Speak soon.