Lanfrica – the hub for African language resources – A conversation with Chris Emezue

Published by Ebuka Ezeike on

Chris Emezue is a researcher and the founder of Lanfrica, an online resource centre that catalogues, archives and links African language resources in order to mitigate the difficulty encountered in discovering African works. He shares the vision and achievements of Lanfrica with Jo on this podcast.

To see all episodes, please go to our CONVERSATIONS page.

Chris Emezue’s research areas of interest include machine/deep learning, reinforcement learning (for games and neural machine translation), natural language processing (speech and language processing). Chris is also a Machine Learning Researcher in NLP. As an African NLP researcher, he is working to promote African languages in Speech and Language Processing.

He spends his time between studying, doing research on structure learning/ causal inference at the Mila Quebec AI Institute, ML advocacy at Hugging Face, AfricaNLP research with Masakhane and improving Lanfrica.

Personal profiles

ORCID iD: 0000-0002-3533-6829

Twitter: @ChrisEmezue

LinkedIn: /in/chrisemezue


Which researcher – dead or alive – do you find inspiring?: Yoshua Bengio

What is your favorite animal and why?: Dog – they are loyal

Name your (current) favorite song and interpret/group.: Said by Nasty C and Runtown

What is your favorite dish/meal?
Eba and soup (with meat)

Chris and friends. Photo Credit: Chris Emezue


Jo: Welcome back to another episode of Access 2 Perspectives Conversations, and welcome Chris Emezue. It’s a pleasure having you here today. Thanks for joining us. 

Chris: Thank you very much. The pleasure is mine. 

Jo: We met, I think it’s about a year ago now, or longer, through your work with Lanfrica, which we’ll hear more about in this episode, and our work with Africarxiv. We stumbled upon each other’s projects and initiatives and quickly realized that we have much in common: at Africarxiv, we are also passionate about fostering multilingualism in African scholarship. And just for those who have never heard about Africarxiv, we are an open access portal that makes it easy for African scholars and researchers to submit manuscripts, research data sets, posters, presentations, and other research output to cloud-based digital repositories, for better discoverability of the research output from the continent and also for collaboration across the continent and beyond. So, yeah, that’s just for context. But now coming to you and your work with Lanfrica. We run projects together, or basically one project currently: we index the works that have been submitted to Africarxiv, or through Africarxiv to the cloud-based repositories we partner with, that cover research on the linguistic aspects of African languages, or that are written in African languages, of which we have two currently. Our other submissions are in English, French or Arabic. Anyway, maybe I should give you the stage to talk about your work in more detail. But basically, we are now indexing the linguistic works that come into Africarxiv in Lanfrica. The stage is yours: please tell us a little bit about yourself, how you got to work on Lanfrica and what you guys are doing.

Chris: Thank you very much, Jo. So, I’m Chris. A little background: I was born in Nigeria to wonderful parents. I did my high school in Nigeria. Then I got a scholarship to go to Russia for my undergraduate studies. That was not my first exposure to many languages, though. In Nigeria, there are about 200 plus, actually 500 plus, languages spoken, but English is the official language, so that kind of gives you the idea. From a young age, I was exposed to a very diverse place where you can hear diverse languages. Just walking a few yards, you could hear a different language. I think I was really exposed to this multilinguality and to trying to communicate with people who speak different languages. But that’s one part. The other part is that there was a kind of wrong perception in Nigerian society that if you really want to be successful, to be in the top league, you have to learn English and not speak your other languages. So I sort of grew up neglecting my mother tongue, which is Igbo, and just improving my English, so I could sound like a native speaker, without an accent or things like that. So going to Russia for my bachelor’s and having to learn the Russian language in order to do my bachelor’s also exposed me to another world of language and getting to know the culture, the language. And unfortunately, I couldn’t express my own culture because I didn’t know or speak my African language. So this desire to start working on my African language is what got me to NLP. When I got to know about machine learning, I was excited about the things it could bring. And then when I learned about natural language processing, which has to do with languages, I just got attracted to it. My main vision was to work on my African language and, by extension, on African languages in general. Because my story is not peculiar to me; it happens to many Africans, those who live in Africa and those who travel out.
So yeah, that’s how I entered into a thing we call African NLP, which is basically NLP for African languages. And it’s a special division because African languages have their own special problems. They are not like English and the other languages, and they have very complex morphological structures. They are very tonal. And on top of that, they’re very low resource. So you can easily find content in English on Wikipedia or by searching on Google, but it’s hard to find content in many of these African languages online. So that’s how I started my journey into African NLP with a couple of researchers. That’s how I also got to meet Masakhane, and that’s also what made me create Lanfrica. During my journey, I realized that one big problem is not that there aren’t African language resources online, it’s just that they’re not discoverable. Some of them are hidden in places. They could be in a Google Drive or on GitHub or places like that. So my second main journey was into how I could help make these things more discoverable. And when I say more discoverable, it’s not really in terms of SEO optimization and things like that. It’s more about having a kind of central hub where you can come and easily find them. So it’s about information retrieval from my perspective, because I believe that if it’s very easy to find these things, then researchers can build on them. There can be progress in that domain if it’s easy to find what has been done already, but if it’s very hard, if you have to spend weeks just finding what has been done already in that domain, you lose interest. I’ve heard of a lot of students in Germany who say they want to do their master’s thesis on an African language, but it’s so hard to find related works in the language. And many times they give up, because they have to produce something for their supervisor. So they end up changing languages. So it’s not that these related works don’t exist, it’s just that they’re very hard to find.
They’re in closed journals; you cannot just find them by typing into Google search. So yeah, this is what Lanfrica is about. It’s about creating a place where you can come and easily find these resources. In order to make that work, we rely a lot on collaboration and a thing we call linking resources. We take the metadata of the resources wherever they are hosted and we put it on our website. So basically our website tells you: okay, this is the name, the title of the resource, this is a short description, and this is the link that takes you back to the original host of the resource. That’s why we call it linking resources. I envision it as kind of a graph of links that take you to the different places where you can find these resources. Yeah, so that’s what we’re doing in Lanfrica. We rely on collaboration because we have to collaborate with some of the data repositories or organizations. Africarxiv, for example, has a very large repository of articles about Africa and from within Africa. So collaborating and partnering with Africarxiv is one of the ways that allow us to really achieve our vision of linking them. For example, through the partnership we were able to link about 239 articles from Africarxiv on Lanfrica. Yeah. So that’s it about Lanfrica and about me.
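
The linking idea Chris describes (store only a title, a short description and a link back to the original host) can be pictured as a very small data structure. The following is a minimal sketch in Python, not Lanfrica’s actual implementation: the names `LinkedResource`, `Catalogue`, `link` and `find` are hypothetical, the `languages` field is an assumption added so that records can be looked up by language, and the example URL is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkedResource:
    """Metadata that points back to a resource hosted elsewhere (hypothetical schema)."""
    title: str
    description: str
    source_url: str          # link back to the original host of the resource
    languages: tuple = ()    # assumed field: language names the resource covers


class Catalogue:
    """Illustrative in-memory index from language name to linked resources."""

    def __init__(self):
        self._by_language = {}

    def link(self, resource):
        # Only the metadata record is stored; the resource itself stays with its host.
        for language in resource.languages:
            self._by_language.setdefault(language, []).append(resource)

    def find(self, language):
        # A language with no linked resources simply yields an empty list.
        return list(self._by_language.get(language, []))


catalogue = Catalogue()
example = LinkedResource(
    title="Example Igbo word list",                   # placeholder record
    description="Placeholder entry used for illustration only.",
    source_url="https://example.org/igbo-word-list",  # placeholder URL
    languages=("Igbo",),
)
catalogue.link(example)
```

Calling `catalogue.find("Igbo")` returns the placeholder record with its link back to the host, while a language that has no linked resources returns an empty list, which is itself useful information about where work is still needed.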

Jo: Excellent. Thank you so much for sharing so much about the background of your work and where your interest was coming from in the first place. How many languages are represented in the Lanfrica database currently? 

Chris: So currently we have accounted for about 2189 plus languages. Yes, of course, many of these languages don’t have many resources, some even have zero. But what we tried to do was we said, let’s really go do our research and try to get as many of these languages as possible and really have them there. Because it’s not just about the languages that have resources, it’s also about the ones that don’t have resources. Because if you know about them, then you can know, okay, these ones don’t have resources. I should probably work on them. 

Jo: Yeah, absolutely. And I know from my work with Wikimedia that the Wikipedia team and communities are passionate about making Wikipedia as multilingual as possible. And they have huge difficulties in extending that multilingualism to African languages, I think just because the community base is maybe not as strong, and also because of the reluctance, as you just said, of Africans, and maybe also because many unfortunately no longer have enough knowledge of the indigenous languages. But for your research, has Wikipedia itself been used as a source to get hold of what languages even exist? That’s the first question, and then I’ve got some more.

Chris: So in my NLP research, Wikipedia has been a huge source, because most of the data sets that we use in NLP research kind of have a root in Wikipedia. It could be Wikipedia articles that were cleaned or pre-processed in some way and then given a new name. But it’s from Wikipedia. That’s especially true of the data sets that have some African languages. Most times the data is either from Wikipedia or from a web crawl, but if it’s cleaner, if it’s more reliable, then it’s mostly from Wikipedia. So yeah, Wikipedia has had a huge impact there. In terms of Lanfrica, I don’t really know. On Lanfrica, we link those datasets because they’re important for those languages. But I don’t think I have used Wikipedia to find out which African languages exist and which ones don’t have enough articles or things like that.

Jo: Yeah, talking about Wikipedia, I just mean because I know Wikipedia itself has lists of language families and categories, which are then based on Wikidata, which might be used for research. But I can imagine, I haven’t checked myself, that the content on Wikidata is also not exhaustive for African languages. What you find there, and the coverage on Wikipedia of the individual languages, as you also said earlier, surely differs. Bigger language groups are better covered and described, also culturally contextualized, whereas smaller languages or smaller language groups are not even mentioned, or have an empty stub, or have just a few lines of text. And what other sources did you refer to, to get to the number of almost 3000 languages? Where did you find the ones that are not in Wikipedia?

Chris: I remember when I first started, I was looking for a place that had a comprehensive list of African languages, not just the popular ones, but even the ones that no one has heard of. I was looking at several places. I think I knew about Glottolog, but not that much. But SIL has a database of languages and it’s grouped by continent, so they have a huge database complete with the ISO 639-3 codes. That was the first place where I found what I was looking for, and that’s what I used at that point, although even that one didn’t have all the languages. As we were adding resources, we were realizing that this thing is not complete. So we’re now adding the new languages that it didn’t have. And I remember I was also using Wikipedia to learn about these languages. Like, wow, this is a language. So you could have a country where one language is spoken in this region and another language is spoken in that region. Wikipedia would give that kind of information, like it is spoken in the western part, or it is also called by this other name. So I was using Wikipedia to learn about these languages. So yeah, those were my sources. Going further, I learned that Glottolog has a much more comprehensive list, especially with the coding. So I plan to explore that and explore other sources.

Jo: Quite an investigative effort. And now, I know that it’s probably not only in Africa, but especially in Africa, that some languages are considered dialects, which African native speakers consider an insult because they’re actually distinct languages. From a Western categorization point of view, and maybe also because of the sheer number of languages that exist on the continent, maybe some scholars from the West were overwhelmed by that. I don’t know, I’m just making assumptions; or it was ignorance, I don’t know. But would you like to elaborate a little bit more on this kind of dissonance about what classifies as a distinct language versus a dialect?

Chris: So I really don’t think I’m empowered to do that, just because I don’t want to just use the linguistic properties of the language and label it. I think one also has to kind of live and experience the language, meet the speakers, try to speak it. So even when I’m trying to get the languages and put them on Lanfrica, I try very hard not to put additional labels on them. I just try to get the language as it is and put it there. What’s important is the name of the language, the meaning, where to learn more about it (Wikipedia is a good source when an article exists), and then whether it has a resource or not. As for going into whether it’s a dialect or not, I don’t think I am the best person to do that. So I try not to really go there. But there are some languages that have gone through debates where there is finally a decision. You would see that on Wikipedia, for example, and for those ones I take that metadata. Then there are languages that are still going through some debate. For example, there are languages where you don’t know which language group to put them in, so sometimes you just put them somewhere or sometimes you leave them out. For languages still under debate, I just take the existing information and link them.

Jo: Yeah, also in my experience and understanding of languages, they are also constantly evolving, like our cultures that are never status quo. So the categorization by whatever applied categories will also change over time and then challenge for other individuals or invited to the discourse.

Chris: Those who make it. 

Jo: Yeah, obviously we’re hoping that the decision making is as inclusive as can be and also involves actual speakers of that language. I think that’s increasingly the case.

So that was another question. And then, what are your hopes, what is your roadmap for Lanfrica? Will there ever be a completion of the project that you can foresee? Are there other kinds of diversity projects that you see emerging from the activities and from the active and potential partners that you get to meet over the course of the activities and speaking engagements? In other words, what’s the potential, what’s the roadmap for Lanfrica? Where would you like to see it moving next? And what’s the timeline you are working with, and how many resources do you think there will be? I don’t know, it also depends on the people working on these languages; you don’t really have that in your power. But yeah, basically, what is your approach and your foresight for the activities and projects coming up?

Chris: Yeah, first of all, the roadmap and the foresight are a work in progress. So it’s not like a fixed roadmap. It’s something I keep trying to come back to and look at from a broader perspective. My vision with Lanfrica was not just about linking resources. Linking resources was like the first step. As a researcher, when I entered African NLP, I was hit with all these problems: lack of discoverability, lack of focus, Africans who don’t care about their languages or who are more into trying to get out and speak other languages. When you talk about lack of data or low-resource data, how do you really get more data coming? Is it just by hiring a group of people to create this data, or to translate some English news, some BBC News or Fox News, into African languages? There’s a lot we’re not so sure about. We are trying the best we can, but I believe that one way to really create African content or African data is to actually get people to use the language, to interact with the language, to actually experience the language. That is an aside and has nothing to do with the roadmap, but I’m just trying to explain the goal of Lanfrica. When we say languages of Africa, it’s about helping these languages of Africa to be assimilated into the world, with people using them. That involves them being used in language technologies in some ways, like in social media. These are some of the things that I really aspire to, the kind of future I see in the next couple of years, maybe ten years, one or two decades. This is the future I envisage: where you can talk with Alexa in your African language, where you can see content on the news in your language. It’s an ambitious vision. So I also try to look for pragmatic things that we can do.
Currently, what we’re doing in Lanfrica: we have linking resources, so we’re creating a platform to make it easier for people to link their resources themselves and for people to find resources. So that’s there, and we’re also looking into automatic ways of getting these resources, probably by crawling the web, finding them and adding them, so that there is less manual effort. Of course, there’ll be manual reviews and stuff. So linking resources is there. But there are other things that Lanfrica is trying to branch into. We now have this thing called Lanfrica Talks, which is trying to address another problem: there’s a lot of hype and attention on large efforts in NLP, or in languages in general. You only see tweets and attention from people who have published papers at big conferences, or companies that have built huge models. But the small efforts of people, you don’t see them anywhere. So we’re trying to tackle that problem by creating a space where these people can share their efforts; we host them and then release the talks online, so that the world can watch and posterity can also get to know about these things. So that’s one space we are in. Another space we’re really investing in and currently working on is data selling, data marketplaces. We’re trying to work in that area for African datasets, because another problem I noticed is that you have researchers, linguists, people with data, but they are scared to put their data anywhere. Unfortunately, what eventually happens is that your data is out there on the web, and some big organizations with the resources crawl your data, use your data, train their models, create a service and sell it back to you. This has been going on for a long time and it keeps going on. So you have some people who don’t want to put their resources out there, and it feels bleak.
But finally, Lanfrica is seeing something we can do about that with the data marketplace. That’s a big thing that I’m very passionate about. And, yeah, we’re still doing the market research, trying to understand how we can really help and things like that. But that’s one place I’m really looking into.

In the future, when you hear Lanfrica, it’s about this effort to really do something about the state of African data sets in the world. It’s easy to talk about it and to let people know and hope that the policies or the government will do something. But I think Lanfrica is me taking concrete, pragmatic steps towards that. Of course, it’s not a big step, but I think it’s helpful. No matter how small it is, it’s actually doing something about it. 

Jo: Yeah, absolutely. And you mentioned quite a few highly important issues and also opportunities to protect data. What we’re also concerned about with Africarxiv is data ownership. When there’s research about any African topic, not only languages but also, say, African animal and plant species, ownership should and must remain with the African people. But as many of us quickly realize, what quite normally happens is that the ownership is transferred to the researchers and then to Western industries and exploited, and then no African stakeholders have any say on the products that are being developed, similar to what you just described. Our approach is to enable African researchers, well, they can already do that, but to show them the venues where it is possible digitally to apply open licenses under which the ownership remains with the people in the research community, and where that is also embedded in the national and institutional settings. Another level of complexity, which I also talked about with Nicholas Outa in the previous episode on helicopter research, is that, as he said, this also sometimes happens within a country. Helicopter research is normally known as researchers coming from Western countries, working together with African researchers, collecting data and gaining regional and local knowledge, but then publishing the results, often without their African colleagues. Those might then be mentioned in the acknowledgments, but otherwise there’s no participation in the ownership of the data, let alone acknowledgment of their contribution to the research project. That is hopefully slowly changing, but there are still cases of it happening. We can also link the episode in the show notes.
He mentioned that sometimes it’s also difficult for, in this case, Kenyan researchers who work with local communities and then find themselves making the same mistakes, like not knowing how, or having no means in the scholarly system, to acknowledge the fishermen and non-scholars or community members they work with. But for that, too, there are solutions now. He himself, his team and the people he works with are doing wonderful work, fully engaging with the communities and all the stakeholders of the research project: informing them, empowering them, informing them of any research that is already published or of their own results, and of how they can improve fish farming in the region. That includes knowledge that is published as research and that the fishermen might not be aware of: why the fish species are declining, what the reasons are, and things like that. And then of course also the other way around, with the fishermen sharing their experience and understanding very well the connection between climate change and the shrinkage of the lakes and rivers.

From Canada, Australia and other countries, there are best practices and evolving standards and categories, but also archiving systems, for ensuring that knowledge remains with the people and the knowledge holders and that attribution is shared in accordance with the actual contribution made throughout the research project. But there’s still a lot of unawareness of these emerging features, standards, projects and best practices. So there’s still some work to do for us and others to inform about the possibilities to ensure ownership of research data and knowledge systems and to protect knowledge from misappropriation in any regard. Okay, but let’s go back to languages. Was there anything that came to mind?

Chris: When you were saying this I was thinking like you’re very right. And part of the thing with AI in Africa is that these policies, these rules and these regulations have been set up in other countries. So you have GDPR in Europe, you have some rules in Canada and the things you mentioned, they are not enforced in many African countries. I use the word enforced because I’ve come to see that some of them are there. So you have the Malabo convention and others. But the good news is that the people involved, the officials, the key researchers and stakeholders are actually now working on trying to write up some of these policies. So it would take time. And what I also learned is that as researchers or people who want to do something, we don’t have to wait for that. We can do what we can do while waiting for that. 

Jo: Yeah, thanks for adding that. I’m just sharing two links. One is the CARE Principles, the complementary CARE Principles, which we have also talked about in previous episodes and which concern ownership rights and ethics. And then Local Contexts is the example from the United States and Canada, and I think Australia has been working on this too, where they work with indigenous communities and try to protect traditional and indigenous knowledge, which doesn’t fit scholarly schemes of data ownership and knowledge ownership. And then, as you also mentioned, what should absolutely be explored are the existing policies in the respective country where the research is being done. These often exist, and researchers have a duty to inform themselves and get the rights and approvals from the authorities to conduct research in a certain manner, and to always apply the highest possible ethical and legal standards, not only from the Western perspective but in agreement with the stakeholders and the cultural background concerned.

Yeah, and also just use our brains about what an accurate way of sharing and a respectful project approach is in the first place. There are plenty of resources and best practices that can be found online and through various institutions that concern themselves with these questions. And yeah, as you said, the difficulty might be that international standards are not enforced at the national level, but there are still opportunities to consult with the stakeholders concerned, to find agreement and have MoUs in place before the project starts: how the knowledge can be extracted and how it’s shared back to the community. So not extracted, but accumulated.

Chris: Yeah. Another thing I would like to talk about, and I would also like to ask what you think: it seems to me that some of these community-first initiatives don’t always attract investors and funders. For example, if you look at NLP right now, there are so many startups that just keep popping up, trying to use NLP models to create services, and very good services, like paraphrasing, summarizing, down to generating text. And even though these models have been shown to have bias, these startups get some really good funding. But then you have initiatives like this, people just trying to create a platform for open access or to make some marginalized communities more discoverable. When I browse through TechCrunch, just looking at the list of startups, I don’t see things like that. So it seems to me that for initiatives like Africarxiv, which is trying to create open access and more access to information about Africa, or Lanfrica, it’s not so easy to attract investors. What do you think about that?

Jo: Yeah, it’s the same experience for us. But I think it’s not out of a lack of interest from many potential donors. I think what eventually attracts investments and donations is the prospect of economic revenue, unfortunately. And value-based initiatives often cannot provide that. The question is, when will this be self-sustaining? It is a long road to walk, and I think many investors just don’t have the patience for that to materialize. There are a few niche communities, and it is also worth looking into funding streams and community-based funding approaches. For example, we use Open Collective. Open Collective is many things: a fundraising platform for open source projects, but also a means to make cash flow transparent, to basically showcase efficiency, expenditure and the revenue that’s being generated, even if small. Projects that depend on philanthropic investments or donations are often not as strong on the economic footing that’s needed in the system for sustainability of the services. And I think, speaking for myself, researchers, NGO people and people who are values-driven often fear commercialism and capitalistic approaches, whereas maybe a healthy mix lies in between. That is what many representatives from both, not extremes, but professional settings are working towards, and it can bring a good mix of knowledge about what’s needed, such as project planning. I’m not saying we don’t do that. We’re also researchers. We can design projects and follow through.

Yeah, so I think basically what it boils down to is what keeps the project alive, if we look at it from a nutritional point of view and not so much from a financial one, because finance often triggers a lot of fear in many of us; there are a lot of misconceptions, bad experiences and observations made. I think it’s a mindset thing, from my personal experience, of not wanting to worry so much about the finances. But money is needed as a tool to keep the project alive and growing, so that everybody who works on the project can put food on the table and keep going.

Chris: Yeah, I agree with you. 

Jo: But I think your question was more that there are not too many opportunities for grant proposals.

I guess there should be more. But I think also, in the times that we live in today, with climate change all around and so many extremes going on, and in a digitally interconnected world where things feel as if they are next door when something happens at the other end of the world, there are also lots of opportunities for many wake-up calls to happen. And I think that’s also happening.

Chris: Yeah.

Jo: What I’m realizing is that it’s a matter of staying open to other initiatives and joining forces, like we did: looking at where we overlap, where we can support each other and what each organization’s strengths are, and then submitting funding proposals together to build a much stronger case, with a much better outcome than each of our organizations would have alone.

And that’s what’s happening. I mean, on a small scale, and I think we also grow into bigger budgets over the years with experience. It’s as you said yourself: observing the market, being in the market for some time, learning the rules, learning about the stakeholders, and then being a stakeholder within the network. And that’s how eventually, yeah, the money, the drive and the project success will also be within reach.

Chris: Yeah, I completely agree with you. 

Jo: I think I’ve also seen similar things in research projects. There are always trending topics; cancer research, for example, will always get money because there is a demand for it, there’s an industry for it. But then there are more exotic topics. I was engaged in the evolution and development niche of research, where five or so groups around the world studied those topics, basically decoding the animal tree of life. So it’s quite essential, and I think the community has now grown bigger, but at the time we were really not so many. I mean, maybe more than five groups, but fewer than 100 researchers in total, with PhD students and others involved. And for us at the time, it was always difficult to find money to keep the research going, compared with any Drosophila lab or labs with established model organisms, such as mice for studying cancer. I mean, there’s also high competition there in the number of researchers competing for the money. But there’s a bigger cultural platform for it. In our case, I think it’s more about developing services outside academia, where we are service providers and stand-alone organizations at the intersection with academia. It’s also a matter of how we can monetize what we’ve learned through the organization. And you also mentioned that you’re investigating opportunities for protecting against misappropriation. But yeah, there’s nothing wrong with fair fees for fair services. This is also a topic I often talk about: digital tools in research, across the research workflow, and how they are funded and sustained financially. The best bet is always a mixed approach: not to rely on just one funding source, but to have more than one and keep them all happy and flowing, so that when one source ceases to provide the influx of finances, the project will not die. It can continue, even with less investment, and then new investors can come in.

Chris: Yeah.

Jo: I think it’s something that many researchers have to learn. Each researcher comes with a tremendous skill set, but then running our own projects, which are at an intersection of academia and entrepreneurship or nonprofit organizations, is basically another kind of workflow. It’s not so different, but there’s another dimension to complete.

Chris: Yeah, I completely agree with that.

Jo: And sometimes it’s frustrating, and then there are also opportunities that arise.

Chris: That’s true. It’s not a straight path, it’s filled with ups and downs. But it’s all part of the experience and part of growing, I believe. 

Jo: I wanted to ask you, from a personal view and also maybe from your experiences and career work, how would you say culture is embedded in language? I just want to contextualize this, because I’m personally involved, and with both our teams at Access 2 Perspectives and Africarxiv we’re engaged in multilingualism and research translation initiatives, one of which is with Masakhane. And one thing that I’ve come to realize is that translating research can never be fully holistic, in a sense, because language itself carries so much cultural context. In many disciplines, if not all, if you look closely enough, a regional research study that was conducted in another language and then translated into English loses a lot of the cultural context, in my view, and of the anticipations or assumptions behind it. You might argue that this doesn’t apply to physics, math and other more technical research topics. Maybe not, and maybe yes. I don’t know. So what’s your impression?

Chris: I will start with a personal point of view, because I went to Russia and studied in Russian, and then I also came to Germany, even though I’m studying in English. So I’ve had interaction with different cultures and languages. What I have seen is this: the definition of culture is the values, the traditions, what defines the people, their way of life. You could also include their way of thinking. All these things are very abstract. You cannot see this culture of the people, you cannot touch it, you cannot interact with it directly. But the language helps you to interact with that culture. I’m not talking of translating; I’m talking of speaking in that language, trying to learn that language. When you try to learn Russian, for example, you kind of shift your mentality from the way you think in English. That mentality shift brings you closer to the way they express themselves, which is the culture. So this is one way, in my view and from my experience, that language helps you interact and connect with the culture.

Another way is this: I see that different cultures are like different worlds, and you could be in your world and never know of another world. But when you learn the language, then suddenly you’re like, whoa, there’s this whole world out there, because it’s a different way of thinking. I got this realization when I was learning Russian. Usually when I’m searching for things on Google, I type in English and Google gives me English responses, and you get many Western views, European views, on the question. But once, because I wasn’t getting the thing I was looking for, I tried to search for the same thing in Russian, and it was like a whole new world out there. I was like, wait, what? So even the web has a different world, and that world is different not just because it’s in Russian, but because even the way of explaining, where they’re coming from to explain the same meaning, is completely different. You can see this in African languages too. If you’re translating English into an African language, mostly what you’re doing is just transferring the culture or the Western perspective into another language. It’s not really interacting with the culture of the African language. So this might explain why you’re saying it loses some value. Because even if you transfer, you cannot transfer everything so well, but we can manage it. But if you create content in the African language, like the person you talked about who was interacting with the fishermen and talking with them, it shows you a different way of thinking, a very interesting mindset and perspective on life that is so different from most of the Western point of view. And you can only do that if you interact in that language. This is why if you talk to a man in a language he understands, it gets to his head. But if you talk to a man in his language, it gets to his heart.
So if you speak to someone in their language, then they can give you from their heart their way of life and way of thinking, which, surprisingly, you will see is completely different from that of other languages. So I see language as a way to communicate with this abstract thing called culture, which is just the mentality, the values, the perspective on life and all these things. You cannot see these things, you cannot connect with them directly, but through language you can communicate with them. This is how language connects to culture, in my opinion.

Jo: Yeah, I share many of your experiences. When I learned Swedish during my undergraduate studies, it felt like a whole new world was opening up to me. Beyond seeing the landscape, I was actually diving into the culture by knowing the language to some extent. I’m not at native-speaker level, but I knew it well enough that after I said a couple of sentences, the Swedes would not ask which country I was coming from, but rather which part of Sweden I was from, though I hadn’t said a lot by then. And there’s also the personality shift that comes with speaking the language. I have to say that this actually also happens to me when I speak another German dialect. I cannot help but adopt a dialect that’s spoken to me, even if I don’t know it very well; I mirror the same kind of intonation that the dialect comes with. And I’ve lived some time in Bavaria, and also in another southern region.

Yeah. And it’s actually difficult for me to keep speaking what’s known as High German or classical German. I don’t think my personality shifts as much with a dialect as it does with speaking other languages, because by being in the country with the people and speaking the language, you actually also experience, to some extent, the culture and the upbringing, what they went through. Of course you’re not living through the whole thing, but you understand the context of the region to a certain extent. And that comes with some mind shift. It’s pretty hard to explain.

Chris: Yeah. 

Jo: For somebody who didn’t experience it. Yeah. And it happened suddenly for me, also with English. With English it’s more complicated, because there are so many dialects, if you wish. There are so many types of English: Canadian, Australian, British, American. Inside Britain alone there are so many dialects and forms and types. And I would also say that there is Kenyan English and Nigerian English, because the English there was also informed by the original words that found their way into the spoken English. South African, too. So there are so many varieties of English. But I remember, I think it was during school time, when I spent some weeks in England for language courses, that it suddenly got me. It’s like when some people say, okay, now I’ve started dreaming in that language. There’s a threshold or gate that you at some point walk through to enter the culture through the language.

Wow. Okay. That’s almost all. It became very philosophical after being highly technical at the beginning of the conversation, but I think it brings home the point that we’re both making about the importance of language diversity, and of preserving, or, not sure if preserving is the right word, documenting languages as much as possible, or building a catalog of the documented works on African languages. So just to come back to Lanfrica: for you it’s not important, or maybe not yet, that each resource on a certain language or work that you index in the Lanfrica database is in a certain data format or in a certain repository, but rather to showcase what’s out there.

Chris: Yes. As long as it’s about African things for now. African languages. Yeah, of course.

It has to be, what’s the word, an article that’s helpful and useful. For example, we cannot take an article that is, I don’t know, lambasting African languages and put it there. No, it has to be an article that’s beneficial to the African language community. So that’s just what I want to say.

Jo: All right. Okay. So we will put the link to your website, of course, in the show notes. So whoever is listening can find all the mentioned resources and organizations in the show notes and the affiliated blog post. There, people can also learn a bit more about you and find your online profiles to get in touch with you. I’m very glad we’re collaborating and will have more opportunities to engage. You are most welcome back to the show anytime you want to share a milestone from Lanfrica or from other projects that you’re working on. Yeah. Thanks for joining today. Thank you.

Chris: Thank you very much, Jo. Thank you for hosting me. I’m very happy to be here and am also very much looking forward to more projects that we do with Africarxiv. Thank you so much.

Jo: Thank you.

On-topic questions

  • Please tell us about your background and research interests and how you got interested in African language documentation. 
  • Our partnership between Lanfrica and AfricArXiv.

References (related research articles)