guides.nyu.edu
URL:
https://guides.nyu.edu/datascience/literate-prog
Submission Tags: demotag1 demotag2 Search All
Submission: On October 29 via api from IE — Scanned from DE
DATA SCIENCE

A guide with resources for the data science community on campus.

ASK YOUR LIBRARIAN!

Hello! I am Vicky Rampin, the Librarian for Research Data Management and Reproducibility. I am also the liaison to the computer science and data science programs at NYU! I am here to help you navigate the resources for both, at NYU and beyond. You can set up an appointment with me or email me at any time: vs77@nyu.edu.

If you need help with specific quantitative, GIS, or qualitative software, you should reach out to Data Services.

RELATED SLIDE DECKS

* Intro to Jupyter Notebooks: This class is designed for first-time and longer-term users of Jupyter Notebooks, a workspace for writing code. The class focuses on using Notebooks to facilitate sharing and publishing of script workflows. It aims to provide users with knowledge about shortcuts, plugins, and best practices for maximizing the re-usability and shareability of Notebook contents.

Original work in this LibGuide is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

LITERATE PROGRAMMING

Donald Knuth introduced literate programming: writing a script, notebook, or computational document that explains the program logic in a natural language (e.g. English or Mandarin), interspersed with snippets of macros and source code that can be compiled and rerun. You can think of it as an executable paper!
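To make the idea of an "executable paper" concrete, here is a minimal Python sketch of a document whose prose is interspersed with runnable code chunks. It is an illustration only, not how Jupyter or RMarkdown work internally: the `<<code>>` / `<<end>>` delimiters, the sample measurements, and the `extract_chunks` and `run_document` helpers are all hypothetical names invented for this example.

```python
import re

# A hypothetical literate document: prose with delimited code chunks.
# The "<<code>>" / "<<end>>" markers are invented for this sketch.
DOCUMENT = """\
We start with three measurements in centimetres.
<<code>>
lengths_cm = [120, 95, 143]
<<end>>
Next we convert them to metres so they match the rest of the dataset.
<<code>>
lengths_m = [value / 100 for value in lengths_cm]
<<end>>
Finally we compute the mean length, which is the number we report.
<<code>>
mean_m = sum(lengths_m) / len(lengths_m)
<<end>>
"""

# Non-greedy match so each chunk stops at its own closing marker.
CHUNK_PATTERN = re.compile(r"<<code>>\n(.*?)<<end>>", re.DOTALL)


def extract_chunks(document):
    """Return the code chunks in the order they appear in the document."""
    return CHUNK_PATTERN.findall(document)


def run_document(document):
    """Execute every chunk from top to bottom in one shared namespace."""
    namespace = {}
    for chunk in extract_chunks(document):
        exec(chunk, namespace)
    return namespace


result = run_document(DOCUMENT)
print(round(result["mean_m"], 2))  # prints 1.19
```

Each chunk depends on variables defined by the chunk before it, so the reported number is only well defined when the chunks run in document order within a single shared namespace.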
No matter which literate programming tool you use, run the cells from top to bottom, and only from top to bottom. The number one cause of irreproducible Jupyter notebooks is that the original authors ran the cells out of order, which cannot be reproduced unless the order in which the cells were run is documented.

JUPYTER NOTEBOOKS

Jupyter Notebook is an interactive computing environment that enables users to author notebook documents that include code, interactive widgets, plots, narrative text, equations, images, and even videos! Jupyter notebooks are heavily used in data science, so it is worth getting comfortable with the tool. The Jupyter name comes from three programming languages: Julia, Python, and R. Each document uses one programming language, which you select by choosing a kernel (e.g. Python, R, Go, and more; see the wiki for the full list of kernels).

Jupyter notebooks are composed mainly of two types of cells (though more can be added with plugins):

1. Markdown Cells (for narratives): when run, a Markdown cell displays the Markdown or HTML that you write (that means all sorts of rich content, including images). Essential Markdown summary: https://daringfireball.net/projects/markdown/syntax
2. Code Cells (for data cleaning, analysis, visualization, etc.): executable code in a variety of languages, dictated by the kernel (the default is Python, but more can be added).

Some key Jupyter notebook shortcuts to keep in mind while you work:

* Use shift + enter to run the active cell
* Press esc in a highlighted cell to enter command mode, then use:
  * esc + L - show line numbers
  * esc + M - format cell as Markdown cell
  * esc + A - insert cell above current cell
  * esc + B - insert cell below current cell
* Check all current variables: run %whos from a code cell

RMARKDOWN

RMarkdown is another popular literate programming tool and can be considered an extension of Markdown.
Like all literate programming tools, it mixes documentation and code, and not just R code either! You can insert code snippets from other languages (SQL, bash, Python, etc.) into one document. Incorporating results directly into your documents is an important step in reproducible research: any changes to your data set or your analysis are automatically reflected in your document the next time it is rendered. Typically, RMarkdown files are edited from within RStudio. For more information, the R for Data Science book contains a great chapter on RMarkdown: https://r4ds.had.co.nz/r-markdown.html.

Last Updated: Oct 18, 2024 11:37 AM
URL: https://guides.nyu.edu/datascience
Subjects: Data Science
Tags: code license, coding, data management, data license, data science, git, instruction, jupyter, programming, reproducibility, research, teaching