dagster.io
76.76.21.21  Public Scan

Submitted URL: http://dagsterlabs.info/
Effective URL: https://dagster.io/
Submission: On July 11 via manual from SG — Scanned from SG

Form analysis: 1 form found in the DOM

Name: dagster_newsleter_subscription (method: POST)

<form name="dagster_newsleter_subscription" method="post" class="flex flex-row gap-2" data-hs-cf-bound="true"><input type="email" placeholder="Email" class=" appearance-none rounded-full w-full py-3 px-5 text-gable-green leading-tight bg-gray-100"
    id="email" name="email" required=""><input type="hidden" id="source" name="source" value="/" required="">
  <div class="flex flex-row items-center"><button type="submit" class="full-width rounded-full bg-gable-green hover:bg-gable-green-darker text-white font-bold py-3 px-6 transition duration-150 ease-in-out">Submit</button></div>
  <script src="https://www.google.com/recaptcha/enterprise.js?render=6Lem3u8gAAAAAM8-gfganw3RcSKUKp6gz8GAQwKR"></script>
</form>
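
The captured form can also be inspected programmatically. The following is a minimal sketch using only Python's standard-library `html.parser`. The `FormFieldCollector` class and the abbreviated `FORM_HTML` string (styling classes and the reCAPTCHA script omitted) are illustrative assumptions, not part of the scan output.

```python
from html.parser import HTMLParser

# Abbreviated copy of the captured form (styling classes omitted for brevity).
FORM_HTML = """
<form name="dagster_newsleter_subscription" method="post">
  <input type="email" id="email" name="email" required="">
  <input type="hidden" id="source" name="source" value="/" required="">
  <button type="submit">Submit</button>
</form>
"""


class FormFieldCollector(HTMLParser):
    """Hypothetical helper: records each form's name, method, and input fields."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "form":
            self.forms.append({
                "name": attr_map.get("name"),
                "method": (attr_map.get("method") or "get").upper(),
                "inputs": [],
            })
        elif tag == "input" and self.forms:
            # Attach the input to the most recently opened form.
            self.forms[-1]["inputs"].append(
                (attr_map.get("name"), attr_map.get("type"))
            )


collector = FormFieldCollector()
collector.feed(FORM_HTML)
for form in collector.forms:
    print(form["name"], form["method"], form["inputs"])
```

Run against the form above, this lists one POST form with a visible `email` field and a hidden `source` field.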

Text Content

Introducing Dagster+, the next generation of Dagster Cloud



SHIP DATA PIPELINES WITH EXTRAORDINARY VELOCITY

Dagster helps data engineers tame complexity. Elevate your data pipelines with
software-defined assets, first-class testing, and deep integration with the
modern data stack.

The cloud-native orchestrator for the whole development lifecycle, with
integrated lineage and observability, a declarative programming model, and
best-in-class testability.



MANAGE YOUR DATA ASSETS WITH CODE

Python assets
dbt-native orchestration
Task-based workflows
from dagster import asset
from pandas import DataFrame, read_html, get_dummies
from sklearn.linear_model import LinearRegression


@asset
def country_populations() -> DataFrame:
    df = read_html("https://tinyurl.com/mry64ebh")[0]
    df.columns = ["country", "pop2022", "pop2023", "change", "continent", "region"]
    df["change"] = df["change"].str.rstrip("%").str.replace("−", "-").astype("float")
    return df


@asset
def continent_change_model(country_populations: DataFrame) -> LinearRegression:
    data = country_populations.dropna(subset=["change"])
    return LinearRegression().fit(get_dummies(data[["continent"]]), data["change"])


@asset
def continent_stats(
    country_populations: DataFrame, continent_change_model: LinearRegression
) -> DataFrame:
    result = country_populations.groupby("continent").sum()
    result["pop_change_factor"] = continent_change_model.coef_
    return result


Materialize All
Materializing an asset launches a run and saves the results to persistent
storage. Trigger materializations right from any asset graph.

continent_change_model: Regression of pop change against continent (materialized Jan 6, 7:39 PM)
continent_stats: Statistics for each continent (materialized Jan 6, 7:40 PM)
country_populations: Population data on all countries (materialized Jan 6, 7:39 PM)
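
To make the idea of materialization concrete, here is a toy sketch in plain Python. It is not Dagster's implementation. It only mirrors two ideas from the text above: an asset's upstream dependencies match its parameter names, and each result is saved to persistent storage (here, one pickle file per asset). The `materialize_all` helper and the simplified stand-in functions are hypothetical.

```python
import inspect
import pickle
import tempfile
from graphlib import TopologicalSorter
from pathlib import Path


def materialize_all(asset_fns, storage_dir):
    """Toy illustration: run each function after its upstreams and pickle the result."""
    by_name = {fn.__name__: fn for fn in asset_fns}
    # An asset's upstream dependencies are taken from its parameter names.
    deps = {
        name: set(inspect.signature(fn).parameters)
        for name, fn in by_name.items()
    }
    results = {}
    # static_order() yields every asset after all of its upstream assets.
    for name in TopologicalSorter(deps).static_order():
        kwargs = {dep: results[dep] for dep in deps[name]}
        results[name] = by_name[name](**kwargs)
        # "Persistent storage" here is just one pickle file per asset.
        (Path(storage_dir) / f"{name}.pickle").write_bytes(pickle.dumps(results[name]))
    return results


# Hypothetical stand-ins for the three assets, using plain Python values.
def country_populations():
    return {"Asia": [4, 2], "Europe": [1, 3]}


def continent_change_model(country_populations):
    return {k: sum(v) / len(v) for k, v in country_populations.items()}


def continent_stats(country_populations, continent_change_model):
    return {
        k: (len(country_populations[k]), continent_change_model[k])
        for k in country_populations
    }


with tempfile.TemporaryDirectory() as tmp:
    # The input order does not matter; the dependency graph decides run order.
    results = materialize_all(
        [continent_stats, country_populations, continent_change_model], tmp
    )
    stored = sorted(p.name for p in Path(tmp).iterdir())

print(stored)
```

The functions are passed in arbitrary order on purpose: the run order comes from the dependency graph, not from the list.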

import numpy as np
import pandas as pd
from dagster import asset, Definitions
from dagster_airbyte import build_airbyte_assets
from dagster_dbt import load_assets_from_dbt_project
from scipy import optimize

from .constants import AIRBYTE_CONNECTION_ID, DBT_PROJECT_DIR


# Assets for the tables written by the Airbyte connection.
airbyte_assets = build_airbyte_assets(
    connection_id=AIRBYTE_CONNECTION_ID,
    destination_tables=["orders", "users"],
    asset_key_prefix=["postgres_replica"],
)


# Assets loaded from the dbt project.
dbt_assets = load_assets_from_dbt_project(project_dir=DBT_PROJECT_DIR)


@asset(compute_kind="python")
def order_forecast_model(daily_order_summary: pd.DataFrame) -> np.ndarray:
    # model_func is not defined in this excerpt; it must be supplied elsewhere.
    train_set = daily_order_summary.to_numpy()
    xdata, ydata = train_set[:, 0], train_set[:, 2]
    return optimize.curve_fit(f=model_func, xdata=xdata, ydata=ydata, p0=[10, 100])[0]


defs = Definitions(
    assets=[*airbyte_assets, *dbt_assets, order_forecast_model]
)


daily_order_summary: Daily metrics for orders placed on this pla... (materialized Jan 6, 8:18 PM)
order_forecast_model: Model parameters that best fit the observ... (materialized Jan 6, 8:19 PM)
orders: Raw order data (materialized Jan 6, 8:18 PM)
orders_cleaned: Filtered version of the raw data (materialized Jan 6, 8:18 PM)
users: Raw user data (materialized Jan 6, 8:18 PM)
users_cleaned: Raw users data augmented with backend ... (materialized Jan 6, 8:18 PM)


from dagster import job, op
from dagster_snowflake import snowflake_resource
from dagster_slack import slack_resource
from .config import SLACK_CONFIG, SNOWFLAKE_CONFIG


@op(required_resource_keys={"snowflake"})
def find_stale_tables(context) -> list[str]:
    # A table is stale if it was last altered more than 30 days ago.
    return [
        record[0]
        for record in context.resources.snowflake.execute_query(
            """select table_schema || '.' || table_name
            from information_schema.tables
            where last_altered < dateadd(day, -30, current_timestamp)""",
            fetch_results=True,
        )
    ]


@op(required_resource_keys={"slack"})
def post_stale_tables(context, tables: list[str]) -> None:
    # Prefix every table, including the first, with "- ".
    context.resources.slack.chat_postMessage(
        channel="#stale-tables", text="Stale tables:\n- " + "\n- ".join(tables)
    )


@job(
    resource_defs={
        "snowflake": snowflake_resource.configured(SNOWFLAKE_CONFIG),
        "slack": slack_resource.configured(SLACK_CONFIG),
    }
)
def report_stale_tables():
    post_stale_tables(find_stale_tables())


find_stale_tables: Find tables that haven't been updated in the last 30 days. Output: [String].
post_stale_tables: Send a Slack message that reports on which tables are stale. Input: tables ([String]). Output: Any.
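
The description above treats a table as stale when it has not been updated in the last 30 days. A minimal plain-Python sketch of that comparison and of the report text follows. The `find_stale` and `format_report` helpers and the sample timestamps are hypothetical, and no Snowflake or Slack call is made.

```python
from datetime import datetime, timedelta


def find_stale(last_altered_by_table, now, max_age_days=30):
    """Return names of tables whose last alteration is older than the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    return sorted(
        name for name, altered in last_altered_by_table.items() if altered < cutoff
    )


def format_report(tables):
    """Build the message text, one '- ' bullet per table."""
    return "Stale tables:\n" + "\n".join(f"- {t}" for t in tables)


now = datetime(2024, 7, 11)
tables = {  # hypothetical sample data
    "analytics.orders": datetime(2024, 7, 1),
    "analytics.users": datetime(2024, 5, 1),
    "raw.events": datetime(2024, 6, 1),
}
stale = find_stale(tables, now)
print(format_report(stale))
```

Only tables altered before the cutoff (here, June 11, 2024) are reported, so `analytics.orders` is left out.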



A SINGLE PANE OF GLASS FOR YOUR DATA PLATFORM

Monitor execution
Monitor runs across all your jobs in one place with the run timeline view.

Debug runs
Zoom into a run with the run details view to pin down issues with surgical
precision.

Inspect assets
See each asset’s context and update it, all in one place: materializations,
lineage, owner, schema, schedule, partitions, and more.

Explore lineage
Get details on each asset: freshness, status, schema, metadata, and dependencies
displayed in one consolidated view. Model and organize thousands of assets,
giving you plenty of room to grow.

Dagster+

From pull request to production. Effortlessly.


THE ENTERPRISE ORCHESTRATION PLATFORM THAT PUTS DEVELOPER EXPERIENCE FIRST, WITH
FULLY SERVERLESS OR HYBRID DEPLOYMENTS, OPERATIONAL OBSERVABILITY, DATA
CATALOGING, AND OUT-OF-THE-BOX CI/CD.


DAGSTER+ BRINGS YOU AN ASSET-ORIENTED APPROACH TO GO WAY BEYOND WHAT TRADITIONAL
ORCHESTRATION DELIVERS.

Try it free for 30 days


DAGSTER POWERS DATA PLATFORMS FOR INNOVATIVE ORGANIZATIONS ALL OVER THE WORLD

Read our users’ success stories



DATA TEAMS FROM STARTUPS TO FORTUNE 500 COMPANIES ALIKE ARE HAVING A BLAST
BUILDING PIPELINES WITH DAGSTER

Join the Slack community

“Dagster has been instrumental in empowering our development team to deliver
insights at 20x the velocity compared to the past. From Idea inception to
Insight is down to 2 days vs 6+ months before.”



Gu Xie



“Dagster Insights has been an invaluable tool for our team. Being able to easily
track Snowflake costs associated with our dbt models has helped us identify
optimization opportunities and reduce our Snowflake costs.”



Timothée Vandeput

Data Engineer

“Dagster is the single pane of glass that our team uses to not only launch and
monitor jobs, but also to surface visibility into data quality, track asset
metadata and lineage, manage testing environments, and even track costs
associated with Dagster and the external services that it manages.”



Zachary Romer



“Somebody magically built the thing I had been envisioning and wanted, and now
it's there and I can use it.”



David Farnan-Williams

Lead Machine Learning Engineer

“Being able to visualize and test changes using branch deployments has enabled
our data team to ship faster.”



Aaron Fullerton



“Dagster brings software engineering best practices to a data team that supports
a sprawling organization with minimal footprint.”



Emmanuel Fuentes




INTEGRATIONS


Integrate with the tools you already use and deploy to your infrastructure.





LATEST POSTS

8 Jul 2024

ENABLING DATA QUALITY WITH DAGSTER AND GREAT EXPECTATIONS

Use Dagster and GX to improve data pipeline reliability without writing custom
logic for data testing.


Name: Muhammad Jarir Kanji, Handle: @muhammad

5 Jul 2024

A START-UP’S RITE OF PASSAGE: ESTABLISHING THE DATA PLATFORM

Zippi successfully navigated a common growth milestone, future-proofing data
operations on Dagster.


Name: Fraser Marlow, Handle: @frasermarlow

21 Jun 2024

VALUE DRIVEN DATA SCIENCE: THE IMPACT OF DATA SCIENCE ON DATA ORCHESTRATION

Sandy Ryza on the impact of data scientists on the creation of the next
generation of data orchestration tools.


Name: Sandy Ryza, Handle: @s_ryz


GET UPDATES DELIVERED TO YOUR INBOX



Dagster is an open-source project maintained by Dagster Labs.

Copyright © 2024 Elementl, Inc. d.b.a. Dagster Labs. All rights reserved.


This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of
Service apply.