Open source AI development efforts are just as important as those of closed-source vendors such as Anthropic and OpenAI.
Open source often sets the pace for industry, uncovering new research and methods that later find their way into mainstream applications, much as Stable Diffusion and Midjourney laid the groundwork for AI image generation.
But there’s a worrying absence in open source AI development: women.
Researchers from Oregon State University mined mailing lists of open-source projects to determine the number of women participating.
The findings were stark: just 8.27% of content was authored by women; 6.63% of those posted one message, while just 2.5% engaged more than ten times.
Another group of US-based researchers found that only about 5% of open source projects have women as core developers, while female authors make up less than 5% of pull requests in open source projects.
More broadly, a 2020 World Economic Forum report found that women make up only 26% of data and AI positions. And The Alan Turing Institute’s Where Are The Women report suggested that women make up around 17% of participants across online global data science platforms.
Notably, the Turing Institute found that women make up only 8% of users on Stack Overflow, a platform often compared to Reddit but specifically tailored to the coding community.
This lack of participation is concerning, especially since Stack Overflow's vast content library is being integrated into ChatGPT through a partnership with OpenAI. If ChatGPT is primarily learning from a male-dominated user base, how can it develop the ability to problem-solve from a more diverse range of perspectives?
emPOWERED is proud to showcase the women making a difference in the open source AI space, highlighting the importance of diversity in technology development.
Mer Joyce
Founder of Do Big Good
Tell us about your journey in open source AI
I became involved in open source AI by being asked to lead co-design for OSI's Open Source AI Definition. Like the Open Source Definition before it, the Open Source AI Definition will have a global impact. It needs to be based on global input. And my co-design firm, Do Big Good, does that work.
Why do you think there’s a lack of women in open source development and open source AI? What are the main barriers preventing women from participating in open source AI?
I hear a lot of talk about the lack of "pipelines" for women into AI, both in industry and in academia. But I don't focus on that. My job as a co-designer is to find the people who are being excluded from these types of technology decisions and actively invite them into the process. The talent is out there. It's just being overlooked. By not inviting women in, we would be the ones missing out.
What can be done to improve and encourage more women in AI, data science, and wider open source development?
I think my answer above responds to this question. You need to find women who are already interested, already doing the work, already displaying talent. You need to invite them into projects, into programs, into companies, into conferences. It's not rocket science. And you have to be intersectional. We want women of color, women from the Global South, and trans women to also be actively included in AI, data science, and open source development.
Finally, who is your female open source hero?
At the moment, it's split between Smita Gupta and Tarunima Prabhakar, both of whom are doing amazing open source AI work in India.
Smita's open source AI project, OpenNyAI, is developing digital public goods (DPGs) that transform how citizens in India experience law and justice.
Tarunima's open source AI project, Tattle, builds citizen-centric tools and datasets to respond to inaccurate and harmful content online.
Connie Yang
Principal AI and Data Science at DesignMind
Tell us about your journey in open source AI
I started my career as a data scientist at Microsoft. I did my undergrad in math and computer science at Carnegie Mellon and did a data science internship at Microsoft before going full-time. Open source technology, especially AI, has been a critical part of my work. During my time at Microsoft, there were early developments of InterpretML, the open source technology for explainable AI, so there were a lot of very exciting things in that space that we were able to leverage to help explain our work.
Throughout my career, I've leveraged a variety of open source tools to build scalable AI solutions, from machine learning frameworks to cloud infrastructure platforms. Currently at DesignMind, we're using a lot of open source AI models like Llama and Mistral to develop generative AI retrieval-augmented generation (RAG) solutions for our large enterprise customers.
I've been focused on evaluating a lot of open source AI to assess the accuracy of generative AI applications. I'm frequently active in GitHub repositories, and it's an incredible resource for collaboration where I'm able to engage with like-minded experts from around the world, whether we're troubleshooting issues or exploring interesting findings in the application of these models.
Why do you think there’s a lack of women in open source development and open source AI? What are the main barriers preventing women from participating in open source AI?
One significant issue is visibility: there just aren't enough high-profile female role models in this space, which makes it difficult for women to see a clear path for themselves. I also think the culture of open source development can feel like a boys' club, where a lot of the competitiveness can overshadow collaboration, and that makes this space feel less welcoming.
Timing is another big challenge too. Open source work is done on a volunteer basis, outside of your normal working hours, and this can be particularly difficult for women who are balancing other responsibilities. For women who become new mothers, for example, that available time shrinks even further.
I co-founded Microsoft's Women in Data Science community. We co-founded the community because we were often the only females. In fact, I was the only woman on my team reporting to my manager with 13 other data scientists who were men.
We built the grassroots movement up from just the two of us to what is now a 1,000-plus community. My fellow co-founder is still at the company and we’ve held panellist talks and workshops at the internal machine learning and data science conference every year.
We've heard from women who want to contribute to the open source space in machine learning and AI and have talked about how difficult it is to juggle the work responsibilities if they were to go on maternity leave and how that affects their careers. A lot of the cultural and structural factors often discourage participation and can limit a lot of opportunities for women to gain visibility in the community.
What can be done to improve and encourage more women in AI, data science, and wider open source development?
My big thing is hackathons — they are a great way for people to foster collaboration and to come together and work in real-time.
I've been a part of enough hackathons to know that most great ideas stem from some hackathon. We should hold more hackathons to encourage women to participate in major projects like responsible AI, as a big part of the problem is that people don't know which projects they can contribute to.
Hosting a hackathon for such projects would encourage people to take part, and then stay in touch through Discord channels.
Finally, who is your female open source hero?
Margaret Mitchell, who used to be the co-lead of Google's ethical AI team. She has been a real driving force in advancing a lot of AI ethics. Her work on open source projects related to explainable AI has been instrumental in making a lot of the AI systems we see today more transparent and accountable. Her commitment to building tools that prioritise fairness and transparency aligns perfectly with the open source philosophy of collaboration and shared progress.
Another person that I have a lot of respect for is Dr Rachel Thomas, who co-founded fast.ai. She has done a lot of work in democratising AI education and accessibility within the space. She's significantly lowered the barrier to entry for machine learning, making it a lot easier for women and underrepresented groups to engage in AI. I think that somebody like her should follow suit and do that for the open source community in AI, lowering the barrier to entry so that people of diverse backgrounds are able to contribute to these projects.
Priya Shivakumar
COO of Lightning AI
Tell us about your journey in open source AI
My journey started in engineering back in the 2000s. More recently, I’m a product leader and enjoy working in open source technologies aimed at solving complex problems.
Before Lightning AI, I was at Confluent, the company behind Apache Kafka. So my work in the last eight years across both companies has been around democratising technologies for wider accessibility.
At Confluent, I played a pivotal role in making Apache Kafka accessible to not only large enterprises but also to individual developers and startups. We built and launched serverless Kafka which didn’t require large clusters to run. So it significantly lowered the entry barrier allowing the development of real-time applications and microservices with minimal investment.
At Lightning AI, we’re the company behind PyTorch Lightning, the popular open source framework that’s been downloaded over 160 million times. It helps everybody, from communities to researchers to enterprises, train and fine-tune AI models at scale.
In my capacity as a product leader, I and the broader team are working on increasing the accessibility of AI tools and resources. The idea is to foster innovation across the entire community and ecosystem, and we want to make deep learning and AI more accessible.
Why do you think there’s a lack of women in open source development and open source AI? What are the main barriers preventing women from participating in open source AI?
The lack of representation is an extension of the representation within engineering as a whole. AI in the past has largely been in academia but now that it’s in business, it is extremely critical to get not just women, but more representation into this space. We want to represent the interests of the entire world, and we want AI models trained to serve us all. Having women and people from all races is important to make that inclusive and ensure that the models represent all of us.
What can be done to improve and encourage more women in AI, data science, and wider open source development?
It's always easier to take a path when you see that somebody has paved a path and there are people ahead of you, so highlighting women in the field is going to be helpful. But also creating opportunities to get involved, not just in core AI research, but in all the aspects related to that, whether it's product, sales, marketing, or engineering.
At Lightning, we’re making concerted efforts: more than 30% of our employees are women, we represent about nine countries and we come together to use and build AI internally.
There has to be more education and more concerted efforts at recruiting. We are looking at getting into schools early with things like academic outreach.
More broadly, it’s about creating awareness around AI. For instance, here in Palo Alto, we have an AI meetup on the weekends and my 15-year-old daughter goes. She’s even used our product after taking a tech course over the summer. Democratizing these tools and getting them into the hands of people, making them super easy and intuitive, whether it's a 15-year-old getting into it or an AI researcher, is going to be critical.
Finally, who is your female open source hero?
I’d say it is Neha Narkhede, the co-founder at Confluent. Jay Kreps, Jun Rao, and Neha built Apache Kafka while working at LinkedIn, open sourced it, and then founded a company around it. I worked with her really closely to build the serverless Apache Kafka, which, even today, is one of the main ways that people get started with Kafka. It was really great; the whole company was very passionate, just like Lightning.