Bloomberg, headquartered in New York, is a global leader in business and financial information and services. In 2022, the company reported revenue of USD 12.2 billion and had a global workforce of over 19,000 employees[1]. The company places strong strategic emphasis on technology and innovation, as exemplified by its more than 8,000 engineers, developers, data scientists, and technologists, who account for 42% of the total employee base[13]. Additionally, the company prides itself on being an innovative first mover that is “always going where others aren’t, can’t, or won’t”[2].
Bloomberg’s employees perform financial data analysis, research, sentiment analysis, news classification, and question answering, which, according to the company, make up a substantial part of their workload[3]. Additionally, users of the Bloomberg Terminal, a platform for financial professionals who need real-time data, news, and analytics, spend a significant amount of time searching for information within the platform. Given that generative AI is often referred to as the new frontier for productivity, it is not surprising that the company has been strategically exploring how to leverage this technology to enhance the productivity and efficiency of its business[4].
In line with its aspiration to be a first mover, Bloomberg released BloombergGPT on March 30, 2023. Bloomberg built the model from scratch, training it on both domain-specific proprietary data and general-purpose data[5]. The company collaborated across various departments and teams to build the model from the ground up. The team combined a 363-billion-token dataset curated by Bloomberg with a 345-billion-token public dataset, creating a training corpus of over 700 billion tokens[17]. Notably, nearly half of the training data originated from nonfinancial sources obtained through web scraping, including GitHub, YouTube subtitles, and Wikipedia[6]. The other half is a combination of proprietary data collected by Bloomberg over the last forty years and publicly available financial data, such as Bloomberg press releases and news articles[15]. In this context, BloombergGPT's competitiveness resides in Bloomberg's vast datasets, which various sources have referred to as the company's ‘crown jewel’[7].
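The dataset figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the two token counts reported in the text, and the constant and function names are hypothetical rather than anything Bloomberg has published.

```python
# Token counts as reported in the text, in billions of tokens.
# The names below are hypothetical and chosen for illustration only.
CURATED_TOKENS_B = 363  # dataset curated by Bloomberg
PUBLIC_TOKENS_B = 345   # general-purpose public dataset


def dataset_mix(curated_b: float, public_b: float) -> dict:
    """Return the combined token count and each source's share of it."""
    total = curated_b + public_b
    return {
        "total_b": total,
        "curated_share": curated_b / total,
        "public_share": public_b / total,
    }


mix = dataset_mix(CURATED_TOKENS_B, PUBLIC_TOKENS_B)
print(f"Total: {mix['total_b']} billion tokens")
print(f"Curated share: {mix['curated_share']:.1%}")
print(f"Public share: {mix['public_share']:.1%}")
```

Under these figures the combined corpus is 708 billion tokens, and the public dataset makes up roughly 49% of it, which is consistent with the statements "over 700 billion tokens" and "nearly half".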
Nevertheless, building in-house is not without risks. It requires significant investments of money, time, and resources, and it requires Bloomberg to employ and retain top talent in AI and engineering.
[1] Forbes: “Bloomberg”, n.d.
[2] Bloomberg: “Pushing the Boundaries on Innovation”, n.d.
[3] Bloomberg: “Introducing BloombergGPT, Bloomberg’s 50-Billion Parameter Large Language Model, Purpose-Built from Scratch for Finance”, 2023
[4] McKinsey: “The Economic Potential of Generative AI: The Next Productivity Frontier”, 2023
[5] Medium: “BloombergGPT, the First Large Language Model for Finance”, 2023
[6] CNBC: “Bloomberg Plans to Integrate GPT-Style A.I. Into Its Terminal”, 2023
[7] Forbes: “Bloomberg Uses Its Vast Data to Create New Finance AI”, 2023