www.modular.com
52.17.119.105
Public Scan
Submitted URL: http://www.modular.com/
Effective URL: https://www.modular.com/
Submission: On April 02 via api from US — Scanned from DE
Form analysis
1 form found in the DOM
Name: email-form — GET
<form id="email-form" name="email-form" data-name="Email Form" method="get" class="perf_opt-inner" data-wf-page-id="65ca59ca70c7535b94cecd16" data-wf-element-id="67b62da9-adf4-db36-d0ac-d94cd878dd29" aria-label="Email Form">
<div class="margin-bottom margin-medium">
<div class="w-layout-vflex perf_opt-models">
<div class="margin-bottom margin-xsmall">
<div class="text-size-metadata">POPULAR ModelS</div>
</div>
<div class="perf_opt-models-mask w-dyn-list">
<div role="list" class="perf_opt-models-inner w-dyn-items">
<div role="listitem" class="w-dyn-item"><label data-type="AMD c6a.16xlarge" data-values-1="" data-values-2="2.2" class="perf_opt-item w-radio is-active"><input type="radio" name="models" id="radio-10" data-name="models"
class="w-form-formradioinput hide w-radio-input" value="Radio"><span class="w-form-label" for="radio-10">Llama2 7b</span></label></div>
<div role="listitem" class="w-dyn-item"><label data-type="AMD c6a.16xlarge" data-values-1="" data-values-2="3.0" class="perf_opt-item w-radio"><input type="radio" name="models" id="radio-10" data-name="models"
class="w-form-formradioinput hide w-radio-input" value="Radio"><span class="w-form-label" for="radio-10">Mistral 7b</span></label></div>
<div role="listitem" class="w-dyn-item"><label data-type="AMD c6a.16xlarge" data-values-1="" data-values-2="3.3" class="perf_opt-item w-radio"><input type="radio" name="models" id="radio-10" data-name="models"
class="w-form-formradioinput hide w-radio-input" value="Radio"><span class="w-form-label" for="radio-10">StarCoder 7b</span></label></div>
<div role="listitem" class="w-dyn-item"><label data-type="Intel c6i.4xlarge" data-values-1="" data-values-2="2.0" class="perf_opt-item w-radio"><input type="radio" name="models" id="radio-10" data-name="models"
class="w-form-formradioinput hide w-radio-input" value="Radio"><span class="w-form-label" for="radio-10">WavLM Large</span></label></div>
<div role="listitem" class="w-dyn-item"><label data-type="AWS c6g.8xlarge" data-values-1="1.8" data-values-2="4.1" class="perf_opt-item w-radio"><input type="radio" name="models" id="radio-10" data-name="models"
class="w-form-formradioinput hide w-radio-input" value="Radio"><span class="w-form-label" for="radio-10">Stable Diffusion UNet</span></label></div>
<div role="listitem" class="w-dyn-item"><label data-type="AWS c7g.4xlarge" data-values-1="2.6" data-values-2="3.5" class="perf_opt-item w-radio"><input type="radio" name="models" id="radio-10" data-name="models"
class="w-form-formradioinput hide w-radio-input" value="Radio"><span class="w-form-label" for="radio-10">RoBERTa Base Seqlen 128</span></label></div>
<div role="listitem" class="w-dyn-item"><label data-type="Intel c5.4xlarge" data-values-1="2.0" data-values-2="1.5" class="perf_opt-item w-radio"><input type="radio" name="models" id="radio-10" data-name="models"
class="w-form-formradioinput hide w-radio-input" value="Radio"><span class="w-form-label" for="radio-10">CLIP-ViT Large Patch14</span></label></div>
<div role="listitem" class="w-dyn-item"><label data-type="AMD c5a.8xlarge" data-values-1="4.4" data-values-2="1.5" class="perf_opt-item w-radio"><input type="radio" name="models" id="radio-10" data-name="models"
class="w-form-formradioinput hide w-radio-input" value="Radio"><span class="w-form-label" for="radio-10">DLRM RMC2</span></label></div>
<div role="listitem" class="w-dyn-item"><label data-type="Intel c6i.4xlarge" data-values-1="3.1" data-values-2="1.4" class="perf_opt-item w-radio"><input type="radio" name="models" id="radio-10" data-name="models"
class="w-form-formradioinput hide w-radio-input" value="Radio"><span class="w-form-label" for="radio-10">GPT-2 Small Seqlen 128</span></label></div>
<div role="listitem" class="w-dyn-item"><label data-type="AMD c5a.4xlarge" data-values-1="2.1" data-values-2="1.4" class="perf_opt-item w-radio"><input type="radio" name="models" id="radio-10" data-name="models"
class="w-form-formradioinput hide w-radio-input" value="Radio"><span class="w-form-label" for="radio-10">BERT Large Uncased Seqlen 256</span></label></div>
</div>
</div>
</div>
</div>
</form>
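The data attributes on the radio labels above appear to carry the benchmark figures the page widget displays: data-type names the cloud instance, and data-values-1 / data-values-2 plausibly hold the "vs TensorFlow" and "vs PyTorch" speedup factors (the Llama2 7b value of 2.2 matches the "2.2x faster than PyTorch" claim in the page copy below). A minimal sketch for extracting them, assuming a local copy of this capture saved as "modular_scan.html" (hypothetical filename):

    # Sketch: pull (model, instance, speedups) out of the captured form HTML.
    # Assumption: data-values-1 / data-values-2 are the vs-TensorFlow /
    # vs-PyTorch speedup factors quoted in the page copy.
    from bs4 import BeautifulSoup

    with open("modular_scan.html") as f:
        soup = BeautifulSoup(f.read(), "html.parser")

    # Each model option is a <label class="perf_opt-item"> inside #email-form.
    for label in soup.select("#email-form label.perf_opt-item"):
        model = label.select_one("span.w-form-label").get_text(strip=True)
        print(model, label["data-type"],
              label["data-values-1"] or "-", label["data-values-2"] or "-")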
Text Content
FIRST STEP IN Mojo🔥 OPEN SOURCE 🚀 INTRODUCING MAX & ALL THE MODCON '23 ANNOUNCEMENTS. READ BLOG

MAX PLATFORM
ACCELERATES THE PACE OF AI. IT'S PROGRAMMABLE.
We rebuilt the modern AI software stack, from the ground up, to boost any AI pipeline, on any hardware.

TRUSTED BY ORGANIZATIONS

01 BENEFITS
PROGRAMMABLE, PERFORMANT & PORTABLE

FULL PROGRAMMABILITY
MAX is built on top of Mojo from the ground up to empower AI engineers to unlock the full potential of AI hardware by combining the usability of Python, the safety of Rust, and the performance of C.

UNPARALLELED PERFORMANCE
MAX unlocks state-of-the-art performance for your AI models. Extend and optimize your AI pipelines without having to rewrite them, powered by a next-generation compiler.

SEAMLESS PORTABILITY
Seamlessly move your models and AI pipelines to any hardware target, maximizing your performance-to-cost ratio and avoiding vendor lock-in.

02 PERFORMANCE
UNPARALLELED LATENCY & COST SAVINGS
MAX unlocks state-of-the-art latency and throughput for your AI pipeline, including generative models, helping you quickly productionize AI pipelines and realize massive cost savings on your cloud bill.

POPULAR MODELS: Llama2 7b, Mistral 7b, StarCoder 7b, WavLM Large, Stable Diffusion UNet, RoBERTa Base Seqlen 128, CLIP-ViT Large Patch14, DLRM RMC2, GPT-2 Small Seqlen 128, BERT Large Uncased Seqlen 256

* 1.7x: Modular is 1.7x faster than TensorFlow when running [Stable Diffusion-UNet] on [CPU]
* 2.2x: Modular is 2.2x faster than PyTorch when running Llama2 7b on AMD c6a.16xlarge

Do these numbers seem too good to be true? View them in more detail, then sign up to compare locally. Explore our performance.

03 PRODUCT
AN INTEGRATED AI DEVELOPER EXPERIENCE
The Modular Accelerated Xecution (MAX) platform is a unified set of tools and libraries that provides everything you need to deploy low-latency, high-throughput, real-time AI inference pipelines into production.

MAX COMPONENTS
* MOJO: A programming language that combines the usability of Python with the performance of C, unlocking unparalleled programmability of AI hardware and extensibility of AI models for all AI engineers. Learn about Mojo: Mojo Docs, Mojo Community.
* MAX ENGINE: A model inference runtime and API library that executes all your AI pipelines on any hardware with unparalleled performance and cost savings. Learn about MAX Engine: MAX Engine Docs, MAX Engine Github Repo.
* MAX SERVING: A model serving library for MAX Engine that provides full interoperability with existing serving systems (e.g., Triton) and seamlessly deploys within existing container infrastructure (e.g., Kubernetes). Learn about MAX Serving: MAX Serving Docs. Get Started.
04 USE CASES
INCREDIBLY EASY TO GET STARTED

QUICK PERFORMANCE WINS
Use our Python or C API to replace your current TensorFlow, PyTorch, or ONNX inference calls with MAX Engine. With 3 lines of code you can execute your AI models up to 5x faster across a variety of CPU architectures (Intel, AMD, ARM). Additionally, use MAX Serving as a drop-in replacement for your NVIDIA Triton Inference Server. (A fuller sketch of this pattern appears at the end of this section.)

    from max import engine

    # Load your model
    session = engine.InferenceSession()
    model = session.load(MODEL_PATH)

    # Prepare the inputs, then run an inference
    outputs = model.execute(**inputs)

EXTEND & OPTIMIZE YOUR MODELS
Once you're using MAX Engine, you can optimize your performance further by using Mojo to write custom ops, or build your whole model in Mojo using the MAX Graph API (for inference).

    from max.graph import Dim, Module, MOTensor

    @value
    struct LLM:
        var params: ModelParams

        fn build(inout self, inout m: Module):
            var g = m.graph(
                "llm",
                TypeTuple(MOTensor(DType.float32, Dim.dynamic(), Dim.dynamic()))
            )
            ...
            g.output(reshape(next_token, self.batch))

FULL STACK ON MAX
Beyond inference performance in MAX Engine, you can further optimize the rest of your AI pipeline by migrating your data pre/post-processing code and application code to Mojo. Over time, we will add more tools and libraries to MAX that accelerate development for other parts of your AI stack.

    from max.engine import InferenceSession

    var sess = InferenceSession()
    var txt_enc = sess.load_model('txt-encoder')
    var img_dec = sess.load_model('img-decoder')
    var img_dif = sess.load_model('img-diffuser')

    var latent = ...
    for step in range(n_steps):
        var prev = latent
        latent = execute(img_dif, latent)
        var pred = ...
        latent = ...
    var decoded = execute(img_dec, latent)
    var pixels = decoded.to_numpy()
    var img = Image.fromarray(pixels, 'RGB')

Get started
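To make the load/execute pattern from QUICK PERFORMANCE WINS concrete, here is a minimal sketch of a complete call. Only engine.InferenceSession, session.load, and model.execute come from the page's own snippet; the ONNX model path and the "pixel_values" input name are hypothetical stand-ins for your model's actual path and input signature:

    # Minimal sketch of the MAX Engine Python pattern shown above.
    # "resnet50.onnx" and the "pixel_values" input name are hypothetical;
    # substitute your model's real path and input names.
    import numpy as np
    from max import engine

    session = engine.InferenceSession()
    model = session.load("resnet50.onnx")

    # Inputs are passed as keyword arguments keyed by input name.
    inputs = {"pixel_values": np.zeros((1, 3, 224, 224), dtype=np.float32)}
    outputs = model.execute(**inputs)
    print(outputs)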
LATEST ABOUT MODULAR
* Developer: THE NEXT BIG STEP IN MOJO🔥 OPEN SOURCE (March 28, 2024)
* Developer: LEVERAGING MAX ENGINE'S DYNAMIC SHAPE CAPABILITIES (March 28, 2024)
* Product: MAX 24.2 IS HERE! WHAT'S NEW? (March 28, 2024)
* Developer: DEPLOYING MAX ON AMAZON SAGEMAKER (March 27, 2024)
* Developer: SEMANTIC SEARCH WITH MAX ENGINE (March 21, 2024)
* Engineering: HOW TO BE CONFIDENT IN YOUR PERFORMANCE BENCHMARKING (March 19, 2024)

WHY MODULAR?

01 BUILT BY THE WORLD'S AI EXPERTS
Our team has built most of the world's existing AI infrastructure, including TensorFlow, PyTorch, ONNX, and XLA, and we've built and scaled dev tools like Swift, LLVM, and MLIR. Now we're focused on rebuilding AI infrastructure for the world.

02 REINVENTED FROM THE GROUND UP
To unlock the next wave of AI innovation, we started with a "first principles" approach to building the lowest layers of the AI stack. We can't keep piling layers of complexity on top of already over-complicated existing solutions.

03 INFRASTRUCTURE THAT JUST WORKS
We build technology that meets you where you are. We don't require you to rewrite your models, workflows, or application code, grapple with confusing converters, or be a hardware expert to take advantage of bleeding-edge technology.

TRY MAX RIGHT NOW
Up and running, for free, in 5 minutes. Get started or book a demo.

Copyright © 2024 Modular Inc. Terms, Privacy & Acceptable Use.