Cloud provider comparison

Cloud - AWS, Azure, GCP for LLM

There are several key differentiators between public cloud providers when it comes to AI workloads and large language models:

Compute power - The major clouds offer GPUs, TPUs, and high-performance computing for training large AI models. But their capabilities differ:

AWS provides scale with its P2 and P3 GPU instances and elastic GPU options, backed by a large fleet of NVIDIA GPUs and a partnership with NVIDIA. It also offers DL1 instances built on Intel Habana Gaudi accelerators.
Google Cloud offers TPUs (Tensor Processing Units), which can significantly accelerate certain types of AI training compared to GPUs alone. Google has used TPUs to train well-known language models such as BERT and T5.
Azure also provides many GPU and high-performance instances. It aims to cover the "long tail" of compute for specialized AI by offering FPGAs and through a partnership with Graphcore.

AI specific services - Each cloud is building higher-level services for machine learning workflows, MLOps, and language models:

AWS SageMaker provides a platform to build, train, and deploy ML models at scale. AWS has tuned it for large language model training.
Google Cloud AI Platform (now part of Vertex AI) is a managed platform for enterprise ML pipelines and use cases. It includes tooling for companies training their own large language models.
Azure Machine Learning is Microsoft's managed AI service. It includes some specialized support for natural language processing and language model training at the infrastructure level.

Partnerships - The clouds are partnering with key AI companies for access to expertise, data, and technology:

AWS has close partnerships with Anthropic, Databricks, and Hugging Face.
Google Cloud partners with companies like Anthropic, Automation Anywhere, and H2O.ai.
Azure partners include Anthropic, Baidu, LightTag, and Nuance Communications, which Microsoft acquired in 2022.

AI talent - The major clouds are hiring many experts in NLP, deep learning, and MLOps to help design and improve their AI and language model services. The depth of AI knowledge and experience is still developing and differs between the providers.

Google has arguably the strongest concentration of AI talent due to its research arm DeepMind and its long history of technical AI expertise.
AWS and Microsoft have been expanding their AI talent through hiring and acquisitions, but they still have more to do to match the depth of Google's internal expertise.

In summary, while all major public clouds are strong on AI, they differ substantially in compute power, managed services, partnerships, and talent. The choice depends on your priorities for training data access, budget, specialized AI needs, integration with open-source projects, and more. No single cloud is superior for all types of AI workloads and language models.
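One way to make "the choice depends on your priorities" concrete is a weighted decision matrix over the four dimensions discussed above. The following is only a minimal sketch: the function name `rank_providers`, the provider labels, and every weight and score are hypothetical placeholders rather than measurements of any real cloud.

```python
# Hypothetical weighted decision matrix. All names, weights, and scores
# below are illustrative placeholders, not data about any real provider.
CRITERIA = ("compute", "managed_services", "partnerships", "talent")


def rank_providers(scores, weights):
    """Return (provider, weighted_score) pairs sorted best-first.

    scores:  {provider: {criterion: number}}, e.g. your own 1-5 ratings
    weights: {criterion: number}, how much each criterion matters to you
    """
    total_weight = sum(weights[c] for c in CRITERIA)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    ranked = []
    for provider, per_criterion in scores.items():
        # Weighted average of this provider's ratings across all criteria.
        value = sum(weights[c] * per_criterion[c] for c in CRITERIA) / total_weight
        ranked.append((provider, round(value, 2)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Placeholder ratings you would replace with your own assessment.
example_scores = {
    "provider_a": {"compute": 4, "managed_services": 3, "partnerships": 5, "talent": 3},
    "provider_b": {"compute": 5, "managed_services": 2, "partnerships": 3, "talent": 4},
}
# Placeholder priorities: compute matters most, talent is ignored.
example_weights = {"compute": 2, "managed_services": 1, "partnerships": 1, "talent": 0}

if __name__ == "__main__":
    for provider, score in rank_providers(example_scores, example_weights):
        print(provider, score)
```

Changing the weights changes the ranking, which is the point: the same ratings can favor different providers depending on what a given workload needs.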

How do the public cloud providers compare in terms of data access for training AI models?

Public cloud providers offer different data access options for training AI models:

AWS:

• AWS Marketplace contains many public datasets that can be used for model training, like ImageNet, iNaturalist, and Reddit, but no private training datasets.

• AWS Data Exchange allows third parties to publish training data that customers can subscribe to. The selection is still limited, though, and so far it is mostly focused on financial, healthcare, and mobility data.

• AWS does not provide any proprietary training data of its own, since it is not a data company. You need to provide your own training data.

• AWS has partnerships with data brokers like Refinitiv and Foursquare to provide data access to customers. But partnerships are still limited.

• AWS Lake Formation makes it easier to gather, catalog, transform and manage your own data for model training. But you still need to have access to substantial data from other sources.

Google Cloud:

• Google Cloud also has public datasets available for training, such as Kaggle Datasets and AI Hub, but no private training datasets.

• Google does have agreements with some healthcare organizations to use their data for training models. But access is very limited.

• Google's strength is Google-sourced data (Search, Maps, YouTube, etc.) used internally for model training. But this data is not directly accessible to Google Cloud customers due to privacy concerns.

• Google Cloud's Vertex AI helps simplify the process of gathering, labeling, and managing your own training data to build models. But you still need data access from elsewhere.

• Google Cloud has few partnerships so far providing data access. But they do have a strong Expert Labeling Service for annotating your own datasets.

Azure:

• Azure Marketplace also contains public datasets for training, like CelebA and Reddit, but no private training datasets.

• Azure has partnerships with data providers like Refinitiv, Foursquare, and others, but it has fewer data partnerships so far than its rivals. Microsoft is still building them up.

• Azure's strength, like Google's, lies in Microsoft's own data (Bing, Office, LinkedIn, etc.). But little of this data is available for customer model training due to privacy issues.

• Azure Machine Learning's MLOps tooling helps you manage the process of readying your own training datasets to build models. But you need to source the data elsewhere.

• Azure has a few healthcare partnerships that provide access to training data, but it offers far less healthcare data access than Google currently does.

Overall, while the major public clouds provide tooling to simplify training data sourcing and management, they rely primarily on the customer to provide the data for building AI models. Proprietary data access is limited due to competitive and privacy concerns. Partnerships to access third-party data are growing but still narrow. There are some differences in data access for specific verticals like healthcare, but in general you cannot rely solely on a public cloud provider's native data offerings for robust AI training. You need data access and partnerships of your own.

What are some of the AI capabilities that Google has that Microsoft doesn't?

Google has several AI capabilities that are more advanced than Microsoft's currently:

General AI research - Google's DeepMind subsidiary is one of the world leaders in general artificial intelligence research. They have made significant progress in areas like reinforcement learning, game theory, and reasoning. DeepMind's AlphaGo program beating Lee Sedol at Go in 2016 was a landmark achievement. Microsoft does not have a comparable general AI research lab.
Custom AI chips - Google develops its own custom AI chips called Tensor Processing Units or TPUs. These chips are designed specifically for neural network machine learning and can provide significant speedups over GPUs. Microsoft currently still relies on NVIDIA GPUs and Intel CPUs for most of its AI workloads.
Robotics - Google has made more progress in physical AI systems like robotics. It acquired companies doing advanced robotics work, such as Boston Dynamics and Schaft, although it later sold Boston Dynamics to SoftBank. Microsoft's AI has mostly focused on software algorithms and virtual assistants so far. Robotics requires specialized hardware and knowledge that Microsoft has less expertise in.
Digital assistant - Google Assistant is generally considered ahead of Microsoft's Cortana digital assistant in its AI capabilities. Google Assistant leads in areas such as natural language understanding, number of supported devices, and third-party integrations. Cortana has struggled to gain as much traction so far.
Healthcare AI - Google has made strong progress applying AI to healthcare through teams such as DeepMind Health and Google Brain. Microsoft has some efforts in this space but lacks Google's level of expertise and investment so far. Healthcare AI requires specialized training data and algorithms that reflect the unique challenges of the field.
Applications of AI - In general, Google has built more applications of AI into their products - things like image search, voice search, spam filtering, content recommendation, and so on. Microsoft is making a bigger push now to build AI into applications and services, but still has more to do to match Google.