Alibaba staffer offers a glimpse into building LLMs in China

Close up of hands typing code on a keyboard with code appearing on monitor in front of the keyboard.

Image Credits: gorodenkoff / Getty Images


Chinese tech companies are gathering all sorts of resources and talent to narrow their gap with OpenAI, and experiences for researchers on both sides of the Pacific Ocean can be surprisingly similar. A recent X post from an Alibaba researcher offers a rare glimpse into the life of developing large language models at the e-commerce firm, which is among a raft of Chinese internet giants striving to match the capabilities of ChatGPT.

Binyuan Hui, a natural language processing researcher at Alibaba’s large language model team Qwen, shared his daily schedule on X, mirroring a post by OpenAI researcher Jason Wei that went viral recently.

The parallel glimpse into their typical day reveals striking similarities, with wake-up times at 9 a.m. and bedtime around 1 a.m. Both start the day with meetings, followed by a stretch of coding, model training and brainstorming with colleagues. Even after getting home, they continue to run experiments at night and ponder ways to improve their models well into bedtime.

The notable differences are in how they choose to characterize leisure time. Hui, the Alibaba employee, mentioned reading research papers and browsing X to catch up on “what is happening in the world.” And as a commenter pointed out, Hui doesn’t have a glass of wine after he arrives home like Wei does.

This intense work regime is not unusual in China’s current LLM space, where tech talent with top university degrees are joining tech companies in droves to build competitive AI models.

To a certain extent, Hui’s demanding schedule seems to reflect a personal drive to match (or at least the social media appearance of doing so), if not outpace, Silicon Valley companies in the AI space. It seems different from the involuntary “996” work hours associated with more “traditional” types of Chinese internet businesses that involve heavy operations, such as video games and e-commerce.

My typical day as a Member of Technical Staff at Qwen (Just for myself): [9:00am] Wake up, might stay in bed for an extra 15 mins. [9:30am] Taking a taxi to work, browse X to catch up on what’s happening in the world, check out @_jasonwei’s latest post. [10:00am] Work … https://t.co/7o47EQrWcW

Binyuan Hui (@huybery), February 21, 2024

Indeed, even renowned AI investor and computer scientist Kai-Fu Lee puts in an incredible amount of effort. When I interviewed Lee about his newly minted LLM unicorn 01.AI in November, he admitted that late hours were the norm, but employees were willingly working hard. That day, one of his staff messaged him at 2:15 a.m. to express his excitement about being part of 01.AI’s mission.

Outward displays of intense work ethic speak to the urgency of the remits laid out by tech firms in the country, and subsequently the speed with which those firms are now rolling out LLMs.

Qwen, for example, has open sourced a series of foundation models trained with both English and Chinese data. The number of parameters (a figure that speaks to the knowledge the model gains from historical training data, which defines its ability to generate contextually relevant responses) is 72 billion for the largest of these. (For some context, GPT-3 from OpenAI is believed to have 175 billion; GPT-4, its latest LLM, is reported to have 1.7 trillion. However, it’s arguable that the aim of a particular LLM will be the more important key to decoding the value of high parameter numbers.)

The team has also been quick to introduce commercial applications. Last April, Alibaba began integrating Qwen into its enterprise communication platform DingTalk and online retailer Tmall.

No definite leader has emerged in China’s LLM space so far, and venture capital firms and corporate investors are spreading their bets across multiple contenders. Besides building its own LLM in-house, Alibaba has been aggressively investing in startups such as Moonshot AI, Zhipu AI, Baichuan and 01.AI.

Facing competition, Alibaba has been trying to carve out a niche, and its multilingual move could become a selling point. In December, the company released an LLM for several Southeast Asian languages. Called SeaLLM, the model is capable of processing information in Vietnamese, Indonesian, Thai, Malay, Khmer, Lao, Tagalog and Burmese. Through its cloud computing business and acquisition of e-commerce platform Lazada, Alibaba has established a sizable footprint in the region and can potentially introduce SeaLLM to these services down the road.

How China is building a parallel generative AI universe
