Preethi, who goes by a single name, as is common in the region, is among the 70 workers hired in Agara and neighboring villages by a startup called Karya to gather text, voice and image data in India’s vernacular languages. She is part of a vast, unseen global workforce — operating in countries like India, Kenya and the Philippines — who collect and label the data that AI chatbots and virtual assistants rely on to generate relevant responses. Unlike many other data contractors, however, Preethi gets paid well for her efforts, at least by local standards.
After three days of working with Karya, Preethi earned Rs 4,500 ($54), more than four times the amount the 22-year-old high school graduate usually makes as a tailor in an entire month. The money is enough, she said, to pay off that month’s installment on a loan she took out to partly repair the crumbling mud walls of her home, which have been carefully patched up with colorful saris. “All I need is a phone and the internet.”
Karya was founded in 2021, before the rise of ChatGPT, but this year’s frenzy around generative AI has only added to tech companies’ insatiable demand for data. India alone is expected to have nearly one million data annotation workers by 2030, according to Nasscom, the country’s tech industry trade body. Karya differentiates itself from other data vendors by offering its contractors, mostly women and mostly in rural communities, as much as 20 times the prevailing minimum wage,