dagster.io
76.76.21.21
Public Scan
Submitted URL: http://dagsterlabs.info/
Effective URL: https://dagster.io/
Submission: On July 11 via manual from SG — Scanned from SG
Form analysis
1 form found in the DOM
Name: dagster_newsleter_subscription
Method: POST
<form name="dagster_newsleter_subscription" method="post" class="flex flex-row gap-2" data-hs-cf-bound="true"><input type="email" placeholder="Email" class=" appearance-none rounded-full w-full py-3 px-5 text-gable-green leading-tight bg-gray-100"
id="email" name="email" required=""><input type="hidden" id="source" name="source" value="/" required="">
<div class="flex flex-row items-center"><button type="submit" class="full-width rounded-full bg-gable-green hover:bg-gable-green-darker text-white font-bold py-3 px-6 transition duration-150 ease-in-out">Submit</button></div>
<script src="https://www.google.com/recaptcha/enterprise.js?render=6Lem3u8gAAAAAM8-gfganw3RcSKUKp6gz8GAQwKR"></script>
</form>
Text Content
Introducing Dagster+, the next generation of Dagster Cloud

SHIP DATA PIPELINES WITH EXTRAORDINARY VELOCITY

Dagster helps data engineers tame complexity. Elevate your data pipelines with software-defined assets, first-class testing, and deep integration with the modern data stack.

The cloud-native orchestrator for the whole development lifecycle, with integrated lineage and observability, a declarative programming model, and best-in-class testability.

MANAGE YOUR DATA ASSETS WITH CODE

Python assets, dbt-native orchestration, Task-based workflows

    from dagster import asset
    from pandas import DataFrame, read_html, get_dummies
    from sklearn.linear_model import LinearRegression

    @asset
    def country_populations() -> DataFrame:
        df = read_html("https://tinyurl.com/mry64ebh")[0]
        df.columns = ["country", "pop2022", "pop2023", "change", "continent", "region"]
        df["change"] = df["change"].str.rstrip("%").str.replace("−", "-").astype("float")
        return df

    @asset
    def continent_change_model(country_populations: DataFrame) -> LinearRegression:
        data = country_populations.dropna(subset=["change"])
        return LinearRegression().fit(get_dummies(data[["continent"]]), data["change"])

    @asset
    def continent_stats(country_populations: DataFrame, continent_change_model: LinearRegression) -> DataFrame:
        result = country_populations.groupby("continent").sum()
        result["pop_change_factor"] = continent_change_model.coef_
        return result

Materialize All

Materializing an asset launches a run and saves the results to persistent storage. Trigger materializations right from any asset graph.
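The assets above name their upstream dependencies through their function parameters. As a rough illustration of that idea only, the sketch below uses the Python standard library to register functions, read their parameter names as dependencies, and run them upstream-first. It is not Dagster's implementation: the `asset` decorator, `REGISTRY`, `materialize_all`, and the sample data here are all hypothetical stand-ins, and results are kept in memory rather than written to persistent storage.

```python
import inspect
from graphlib import TopologicalSorter

# Hypothetical registry: a toy stand-in, not Dagster's asset machinery.
REGISTRY = {}


def asset(fn):
    """Register a function under its own name (toy version of an asset decorator)."""
    REGISTRY[fn.__name__] = fn
    return fn


def materialize_all(registry=REGISTRY):
    """Run every registered asset once, upstream assets first, and return all results."""
    # Each parameter name is interpreted as the name of an upstream asset.
    deps = {
        name: set(inspect.signature(fn).parameters)
        for name, fn in registry.items()
    }
    results = {}
    for name in TopologicalSorter(deps).static_order():
        fn = registry[name]
        kwargs = {p: results[p] for p in inspect.signature(fn).parameters}
        results[name] = fn(**kwargs)  # kept in memory only in this sketch
    return results


@asset
def country_populations():
    # Made-up rows standing in for the scraped population table.
    return [
        {"continent": "Asia", "pop2022": 100, "pop2023": 110},
        {"continent": "Asia", "pop2022": 50, "pop2023": 50},
        {"continent": "Europe", "pop2022": 80, "pop2023": 76},
    ]


@asset
def continent_totals(country_populations):
    # Sum both population columns per continent.
    totals = {}
    for row in country_populations:
        t = totals.setdefault(row["continent"], {"pop2022": 0, "pop2023": 0})
        t["pop2022"] += row["pop2022"]
        t["pop2023"] += row["pop2023"]
    return totals


@asset
def continent_change(continent_totals):
    # Percentage change from 2022 to 2023 per continent.
    return {
        c: round((t["pop2023"] - t["pop2022"]) / t["pop2022"] * 100, 2)
        for c, t in continent_totals.items()
    }
```

Calling `materialize_all()` returns a dictionary keyed by asset name. Because the ordering comes from a topological sort, the order in which the functions are defined does not matter, only the parameter names do.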
continent_change_model: Regression of pop change against continent. Materialized Jan 6, 7:39 PM
continent_stats: Statistics for each continent. Materialized Jan 6, 7:40 PM
country_populations: Population data on all countries. Materialized Jan 6, 7:39 PM

    import numpy as np
    import pandas as pd
    from scipy import optimize

    from dagster_airbyte import build_airbyte_assets
    from dagster_dbt import load_assets_from_dbt_project
    from dagster import asset, Definitions

    from .constants import AIRBYTE_CONNECTION_ID, DBT_PROJECT_DIR

    airbyte_assets = build_airbyte_assets(
        connection_id=AIRBYTE_CONNECTION_ID,
        destination_tables=["orders", "users"],
        asset_key_prefix=["postgres_replica"],
    )

    dbt_assets = load_assets_from_dbt_project(project_dir=DBT_PROJECT_DIR)

    # model_func is not defined in this snippet; it is assumed to exist elsewhere in the project.
    @asset(compute_kind="python")
    def order_forecast_model(daily_order_summary: pd.DataFrame) -> np.ndarray:
        train_set = daily_order_summary.to_numpy()
        xdata, ydata = train_set[:, 0], train_set[:, 2]
        return optimize.curve_fit(f=model_func, xdata=xdata, ydata=ydata, p0=[10, 100])[0]

    defs = Definitions(
        assets=[*airbyte_assets, *dbt_assets, order_forecast_model]
    )

Materialize All

Materializing an asset launches a run and saves the results to persistent storage. Trigger materializations right from any asset graph.

daily_order_summary: Daily metrics for orders placed on this pla... Materialized Jan 6, 8:18 PM
order_forecast_model: Model parameters that best fit the observ... Materialized Jan 6, 8:19 PM
orders: Raw order data. Materialized Jan 6, 8:18 PM
orders_cleaned: Filtered version of the raw data. Materialized Jan 6, 8:18 PM
users: Raw user data. Materialized Jan 6, 8:18 PM
users_cleaned: Raw users data augmented with backend ... Materialized Jan 6, 8:18 PM

    from dagster import job, op
    from dagster_snowflake import snowflake_resource
    from dagster_slack import slack_resource

    from .config import SLACK_CONFIG, SNOWFLAKE_CONFIG

    @op(required_resource_keys={"snowflake"})
    def find_stale_tables(context) -> list[str]:
        # A table is stale when it was last altered more than 30 days ago.
        return [
            record[0]
            for record in context.resources.snowflake.execute_query(
                """select table_schema || '.' || table_name
                from information_schema.tables
                where last_altered < dateadd(day, -30, current_timestamp)""",
                fetch_results=True,
            )
        ]

    @op(required_resource_keys={"slack"})
    def post_stale_tables(context, tables: list[str]) -> None:
        # Prefix every table, including the first, with a list bullet.
        context.resources.slack.chat_postMessage(
            channel="#stale-tables", text="Stale tables:\n- " + "\n- ".join(tables)
        )

    @job(
        resource_defs={
            "snowflake": snowflake_resource.configured(SNOWFLAKE_CONFIG),
            "slack": slack_resource.configured(SLACK_CONFIG),
        }
    )
    def report_stale_tables():
        post_stale_tables(find_stale_tables())

find_stale_tables: Find tables that haven't been updated in the last 30 days. Output: [String]
post_stale_tables: Send a Slack message that reports on which tables are stale. Input: tables [String]. Output: Any

A SINGLE PANE OF GLASS FOR YOUR DATA PLATFORM

Monitor execution: Monitor runs across all your jobs in one place with the run timeline view.
Debug runs: Zoom into a run to pin down issues with surgical precision with the run details view.
Inspect assets: See each asset’s context and update it, all in one place: materializations, lineage, owner, schema, schedule, partitions, and more.
Explore lineage: Get details on each asset: freshness, status, schema, metadata, and dependencies displayed in one consolidated view. Model and organize thousands of assets, giving you plenty of room to grow.

Dagster+

From pull request to production. Effortlessly.

THE ENTERPRISE ORCHESTRATION PLATFORM THAT PUTS DEVELOPER EXPERIENCE FIRST, WITH FULLY SERVERLESS OR HYBRID DEPLOYMENTS, OPERATIONAL OBSERVABILITY, DATA CATALOGING, AND OUT-OF-THE-BOX CI/CD. DAGSTER+ BRINGS YOU AN ASSET-ORIENTED APPROACH TO GO WAY BEYOND WHAT TRADITIONAL ORCHESTRATION DELIVERS.
Try it free for 30 days

DAGSTER POWERS DATA PLATFORMS FOR INNOVATIVE ORGANIZATIONS ALL OVER THE WORLD

Read our users’ success stories

DATA TEAMS FROM STARTUPS TO FORTUNE 500 COMPANIES ALIKE ARE HAVING A BLAST BUILDING PIPELINES WITH DAGSTER

Join the Slack community

“Dagster has been instrumental in empowering our development team to deliver insights at 20x the velocity compared to the past. From Idea inception to Insight is down to 2 days vs 6+ months before.”
Gu Xie

“Dagster Insights has been an invaluable tool for our team. Being able to easily track Snowflake costs associated with our dbt models has helped us identify optimization opportunities and reduce our Snowflake costs.”
Timothée Vandeput, Data Engineer

“Dagster is the single pane of glass that our team uses to not only launch and monitor jobs, but also to surface visibility into data quality, track asset metadata and lineage, manage testing environments, and even track costs associated with Dagster and the external services that it manages.”
Zachary Romer

“Somebody magically built the thing I had been envisioning and wanted, and now it's there and I can use it.”
David Farnan-Williams, Lead Machine Learning Engineer

“Being able to visualize and test changes using branch deployments has enabled our data team to ship faster”
Aaron Fullerton

“Dagster brings software engineering best practices to a data team that supports a sprawling organization with minimal footprint.”
Emmanuel Fuentes

INTEGRATIONS

Integrate with the tools you already use and deploy to your infrastructure.

LATEST POSTS

8 Jul 2024
ENABLING DATA QUALITY WITH DAGSTER AND GREAT EXPECTATIONS
Use Dagster and GX to improve data pipeline reliability without writing custom logic for data testing.
Name: Muhammad Jarir Kanji
Handle: @muhammad

5 Jul 2024
A START-UP’S RITE OF PASSAGE: ESTABLISHING THE DATA PLATFORM
Zippi successfully navigated a common growth milestone, future-proofing data operations on Dagster.
Name: Fraser Marlow
Handle: @frasermarlow

21 Jun 2024
VALUE DRIVEN DATA SCIENCE: THE IMPACT OF DATA SCIENCE ON DATA ORCHESTRATION
Sandy Ryza on the impact of data scientists on the creation of the next generation of data orchestration tools.
Name: Sandy Ryza
Handle: @s_ryz

GET UPDATES DELIVERED TO YOUR INBOX

Dagster is an open-source project maintained by Dagster Labs.
Copyright © 2024 Elementl, Inc. d.b.a. Dagster Labs. All rights reserved.
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.