The Battle of the AI Platforms

How Samsung Navigates the Maze of Big Data, Generative AI, and Vendor Evaluation

Introduction: Deciphering the Chaos of AI Platforms 

Imagine you’re Samsung’s Director of Data Intelligence, and your mission is to build a tech stack capable of taming the chaos of generative AI, machine learning (ML), big data, and all the baggage of modern tech buzzwords. It’s like being handed a treasure map where the X moves every day, armed with nothing but a crayon sketch made in MS Paint.

Let’s dive into how Samsung evaluates AI vendors, allocates budget, and keeps up with the evolving landscape of labeling, cloud services, and on-prem solutions. It’s not only about who’s best today—it’s about who will still be relevant tomorrow.

1. Vendor Evaluation: More Vendors, More Problems?

When evaluating vendors like Scale AI, Snorkel, and Labelbox, Samsung doesn’t just settle for the highest-rated tool. They size up each AI vendor the way you’d pick out new sneakers: the tools need to handle different terrain, fit comfortably into the existing tech stack, and not make the overall operation clunky.

For example, Scale AI has been a steady choice with an accuracy score over 0.76 for labeling tasks, but the Director points out its human-intensive processes can lead to delays. That’s why Samsung uses Snorkel for text labeling—it’s like opting for Velcro straps instead of dealing with fancy knots.

Vendor evaluation goes beyond accuracy; operational efficiency, costs, and the type of data being processed all matter. Each platform has its strengths, whether in handling text, audio, or video, and striking the right balance is key.

In evaluating vendors like Scale AI, Snorkel, and Labelbox, Samsung prioritizes accuracy, cost efficiency, and operational flexibility. Investors would value metrics on accuracy scores, vendor-specific operational costs, and scalability ratings for text, audio, and video labeling tasks, helping to illustrate how Samsung maximizes performance across different data types.
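To make that weighting concrete, here is a minimal sketch in Python of how such a multi-criteria scorecard could be combined into a single ranking. The weights and nearly all of the scores are hypothetical placeholders; only Scale AI’s roughly 0.76 accuracy figure comes from the discussion above.

```python
# Hypothetical weighted scorecard for labeling vendors. Criteria mirror the
# ones discussed above (accuracy, cost efficiency, operational flexibility);
# every number except Scale AI's ~0.76 accuracy is an illustrative placeholder.

WEIGHTS = {"accuracy": 0.5, "cost_efficiency": 0.3, "flexibility": 0.2}

vendors = {
    "Scale AI": {"accuracy": 0.76, "cost_efficiency": 0.60, "flexibility": 0.55},
    "Snorkel":  {"accuracy": 0.70, "cost_efficiency": 0.75, "flexibility": 0.80},
    "Labelbox": {"accuracy": 0.72, "cost_efficiency": 0.65, "flexibility": 0.70},
}

def weighted_score(scores: dict) -> float:
    """Collapse per-criterion scores (0-1) into one comparable number."""
    return sum(WEIGHTS[criterion] * value for criterion, value in scores.items())

if __name__ == "__main__":
    # Rank vendors from highest to lowest combined score.
    for name, scores in sorted(vendors.items(), key=lambda kv: -weighted_score(kv[1])):
        print(f"{name:10s} {weighted_score(scores):.3f}")
```

In practice the weights would shift by data type; a video-heavy workload might lean harder on operational flexibility than on raw accuracy.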

Table 1: Comparison of AI Labeling Vendors

Samsung’s AI Vendor Strategy: Navigating a Crowded Field

2. Budget Allocation: The Eternal Juggling Act

Samsung’s AI budget resembles a complex Venn diagram: about 40% goes to infrastructure, compute, and storage, spread primarily across AWS, GCP, and on-prem setups. Why not pick just one? It’s all about adaptability.

AWS has capabilities that GCP doesn’t, while GCP scores better in cost efficiency for optimization tasks. Meanwhile, their on-prem solutions—leveraging Pure Storage, WEKA, and others—help to mitigate the ballooning costs of GPU consumption. If Samsung were to bet it all on a single horse, they’d risk limiting their adaptability. In a fast-moving tech world, locking in is like getting a tattoo on a first date.

Storage and compute aren’t the only budget hogs. Data processing accounts for another 20%, with Databricks and Snowflake in the game. As Snowflake use increases, so do costs, which have gone up by 16-17%. Budget allocation in the AI world is akin to budgeting a summer road trip—every unexpected ice cream cone (or GPU) adds up.

Samsung’s complex budget allocation reflects the need for flexibility across various AI platforms. Investors would find metrics such as infrastructure spending percentage, storage costs per provider, and annual budget increases for data processing insightful for understanding Samsung’s financial commitment to AI scalability.
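As a rough illustration of how this split plays out, the sketch below applies the percentages mentioned above to a hypothetical total budget. The total dollar figure and the "other" bucket are assumptions, and the 16-17% Snowflake-driven increase is applied to the whole data processing line as a simplification.

```python
# Illustrative budget breakdown: ~40% infrastructure (compute + storage) and
# ~20% data processing, as described above. The total budget is a made-up
# placeholder, and the remaining 40% "other" bucket is an assumption.

TOTAL_AI_BUDGET = 10_000_000  # hypothetical annual AI budget in USD

allocation = {
    "infrastructure (AWS / GCP / on-prem)": 0.40,
    "data processing (Databricks / Snowflake)": 0.20,
    "other (labeling, tooling, headcount)": 0.40,
}

# Midpoint of the 16-17% cost increase mentioned above, applied to the whole
# data processing bucket as a simplification.
DATA_PROCESSING_GROWTH = 0.165

for bucket, share in allocation.items():
    print(f"{bucket:45s} ${share * TOTAL_AI_BUDGET:>12,.0f}")

next_year = allocation["data processing (Databricks / Snowflake)"] * TOTAL_AI_BUDGET * (1 + DATA_PROCESSING_GROWTH)
print(f"{'data processing next year, if growth holds':45s} ${next_year:>12,.0f}")
```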

Table 2: Resource Allocation in AI Infrastructure

Samsung’s AI Budget Allocation Strategy

3. Data Labeling in a Generative AI World

At first, Samsung hoped LLMs (large language models) would solve all their labeling problems. Just throw in some prompts, they thought, and poof: automation would take care of the rest. Fast forward three years, and they’re still spending up to $700,000 annually on data labeling across Scale, Snorkel, and Labelbox.

Why? Because LLMs, despite all the hype, still hallucinate and misinterpret things like a 5-year-old trying to explain quantum physics. Generative AI isn’t yet the magical silver bullet some hoped for, and for tasks like recognizing audio content or properly labeling images, human intervention and third-party solutions still play a crucial role.

As a result, Samsung expects labeling budgets to increase 15-20% annually over the next few years as AI continues to need these foundational elements. AI without labeled data is like an artist trying to paint without a canvas or brushes: an amusing but ultimately futile endeavor.

As data labeling costs rise, Samsung anticipates a 15-20% annual increase in labeling budgets. Investors would be interested in metrics such as current annual labeling costs, dependency on LLMs, and projected budget growth for labeling to gauge Samsung’s long-term commitment to high-quality data labeling in a Generative AI landscape.
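A quick back-of-the-envelope projection shows what that range implies. The sketch below simply compounds the roughly $700,000 figure cited above at the low and high ends of the stated 15-20% growth rate; nothing here is an official forecast.

```python
# Compound-growth projection of labeling spend, using the figures above:
# up to ~$700k today, growing 15-20% per year. Purely illustrative arithmetic.

CURRENT_LABELING_SPEND = 700_000  # upper bound cited above, USD per year

def project(spend: float, annual_growth: float, years: int) -> float:
    """Compound the current spend forward at a constant annual growth rate."""
    return spend * (1 + annual_growth) ** years

for growth in (0.15, 0.20):  # low and high ends of the stated range
    print(f"{growth:.0%} growth -> ~${project(CURRENT_LABELING_SPEND, growth, years=3):,.0f} after 3 years")
```

Even at the low end, that puts labeling spend above $1 million within three years, which is why it keeps showing up as its own budget line.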

Table 3: Annual Spending and Dependency on LLMs for Data Labeling

Samsung’s Rising Data Labeling Costs in the Generative AI Era

4. Generative AI and Infrastructure: Cloud vs. On-Prem vs. Private Cloud

When it comes to scaling LLMs and generative AI applications, Samsung makes careful decisions between cloud and on-prem solutions. Moving more workloads on-prem may seem counterintuitive in an era of “cloud-first” everything, but for GenAI workloads, this choice often results in better cost control and lower overhead in the long run.

The Director points out that the return on investment is higher when they build their own infrastructure, provided they stay committed for several more years. Essentially, it’s like buying your own water filtration system instead of paying for bottled water delivery: expensive up front, but cost-effective if you plan to stay hydrated.

AWS, GCP, and even players like Lambda Labs and Anthropic have been evaluated for these roles. The balance is key: when should they rely on someone else’s cloud, and when should they keep their cards close to the chest with on-prem infrastructure?

Samsung’s infrastructure choices reflect a balanced strategy for cost control and scalability. Investors would value metrics like cost savings with on-prem infrastructure, ROI for cloud vs. on-prem setups, and cost per workload across cloud providers to understand Samsung’s rationale for its multi-platform infrastructure.
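The water-filter logic boils down to a break-even calculation. The sketch below shows the shape of that trade-off with entirely hypothetical cost figures; none of the dollar amounts come from Samsung, and real GPU pricing would move the crossover point.

```python
# Break-even sketch for cloud vs. on-prem GenAI infrastructure. All figures
# are hypothetical placeholders; the point is the shape of the trade-off
# (large upfront capex vs. recurring cloud spend), not the specific values.

ONPREM_CAPEX = 3_000_000        # assumed one-time hardware + setup cost (USD)
ONPREM_OPEX_PER_MONTH = 40_000  # assumed power, cooling, staff, support
CLOUD_COST_PER_MONTH = 150_000  # assumed equivalent GPU capacity rented on AWS/GCP

def cumulative_onprem(months: int) -> float:
    return ONPREM_CAPEX + ONPREM_OPEX_PER_MONTH * months

def cumulative_cloud(months: int) -> float:
    return CLOUD_COST_PER_MONTH * months

# First month in which owning the hardware becomes cheaper than renting it.
breakeven = next(m for m in range(1, 121) if cumulative_onprem(m) < cumulative_cloud(m))
print(f"On-prem breaks even after ~{breakeven} months under these assumptions")
```

Under those made-up numbers the crossover lands a little past the two-year mark, which is exactly why the Director ties the on-prem bet to a multi-year commitment.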

Table 4: Comparison of Cloud and On-Prem Infrastructure for AI Workloads

Samsung’s AI Infrastructure Strategy: Cloud vs. On-Prem vs. Private Cloud

5. Evaluating the Start-Ups: Tech on the Edge

Another factor that differentiates Samsung’s approach is their willingness to explore new, exciting startups. Anthropic, Lambda Labs, and LangChain are among the vendors under evaluation. They like to get a taste of the shiny new toys in the tech world while keeping a realistic view of where each might fit within the business model.

Confluent earns a particularly high rating of 9.5 for its Kafka offering, signaling satisfaction with the company. OpenAI, meanwhile, gets an 8: good, but customization options are limited and the costs aren’t trivial. In short, it’s like test-driving a luxury car; nice, but not necessarily what you’d use for your daily grocery runs.

Samsung’s willingness to explore new startups gives it a competitive edge. Investors would find metrics like Net Promoter Scores (NPS) for each vendor, customization flexibility, and projected cost impacts of adopting new technology useful for understanding Samsung’s approach to innovation.

Table 5: NPS Scores and Strengths of Leading AI Vendors

Samsung’s Startup Evaluations: Tech on the Edge

Conclusion: Mastering the Art of Adaptability

Navigating the world of AI, big data, and machine learning vendors is like charting a course through a constantly changing maze. The goals move, the technology evolves, and the options grow, each with its strengths and weaknesses. Samsung has mastered the art of adaptability, relying on a combination of cloud services, on-prem infrastructure, and an ever-evolving lineup of labeling and data processing tools.

In the end, it’s all about balance—finding which vendor fits which use case, allocating budgets without the CFO having a heart attack, and staying nimble in the face of emerging technologies.

Samsung’s AI Strategy: Navigating the Vendor Maze