Submitted URL: https://connect.pluralsight.com/dc/l6WRhHKnwa3LcpUsJiJZ-KNupKrVAFT9E5p1RhihJqW5Ijy1l2jnfCBnXkLCL9zlXM10Ps5UjCnOf3GNHGqqjtRGEE4GA...
Effective URL: https://www.pluralsight.com/resources/blog/data/how-build-large-language-model?utm_source=marketo&utm_medium=email&utm_campa...
CREATING A LARGE LANGUAGE MODEL FROM SCRATCH: A BEGINNER'S GUIDE

A step-by-step guide on how to create your first Large Language Model (LLM),
even if you're new to natural language processing.

By Axel Sirota

Feb 15, 2024 • 10 Minute Read


 * Data
 * AI & Machine Learning

--------------------------------------------------------------------------------

Imagine stepping into the world of language models as a painter stepping in
front of a blank canvas. The canvas here is the vast potential of Natural
Language Processing (NLP), and your paintbrush is the understanding of Large
Language Models (LLMs). This article aims to guide you, a data practitioner new
to NLP, in creating your first Large Language Model from scratch, focusing on
the Transformer architecture and utilizing TensorFlow and Keras.


TABLE OF CONTENTS

 * Understanding the basics
 * Building the Transformer with TensorFlow and Keras
 * Training the model
 * Implementing transfer learning with Hugging Face
 * Further resources


UNDERSTANDING THE BASICS


WHAT IS A LARGE LANGUAGE MODEL?

A Large Language Model (LLM) is akin to a highly skilled linguist, capable of
understanding, interpreting, generating, and sometimes translating human
language. In practice, it is a machine learning model trained on vast amounts of
text data, and it belongs to the broader field of natural language processing
(NLP). Let's break down the concept to understand it better:

KEY CHARACTERISTICS OF LARGE LANGUAGE MODELS:

 1. Large Scale: As the name suggests, these models are 'large' both in the
    number of parameters they contain and in the amount of data they are
    trained on. Models like BERT, T5, and GPT-3 range from hundreds of millions
    to hundreds of billions of parameters and are trained on diverse datasets
    comprising texts from books, websites, and other sources.

 2. Understanding Context: One of the primary strengths of LLMs is their ability
    to understand the context. Unlike earlier models that focused on individual
    words or phrases in isolation, LLMs consider the entire sentence or
    paragraph, allowing them to comprehend nuances, ambiguities, and the flow of
    language.

 3. Generating Human-Like Text: LLMs are known for their ability to generate
    text that closely resembles human writing. This includes completing
    sentences, writing essays, creating poetry, or even generating code. The
    advanced models can maintain a theme or style over long passages.

 4. Adaptability: These models can be fine-tuned or adapted for specific tasks,
    like answering questions, translating languages, summarizing texts, or even
    creating content for specific domains like legal, medical, or technical
    fields.
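
To make the idea of scale concrete, the sketch below estimates how many
trainable parameters a stack of Transformer encoder layers contains. It is a
rough back-of-the-envelope count, not an exact figure for any released model.
The function name encoder_layer_params is hypothetical, and the hyperparameters
(d_model=768, dff=3072, 12 layers, a vocabulary of 30,522 tokens) are assumed
here in the style of BERT-base. The count assumes the attention heads together
span d_model dimensions, as in the original Transformer paper, and it ignores
position embeddings and task-specific heads.

```python
def encoder_layer_params(d_model, dff):
    """Approximate trainable parameters in one Transformer encoder layer."""
    # Attention: query, key, value and output projections,
    # each a d_model x d_model weight matrix plus a bias vector.
    attention = 4 * (d_model * d_model + d_model)
    # Feed-forward: d_model -> dff -> d_model, each step with a bias vector.
    feed_forward = (d_model * dff + dff) + (dff * d_model + d_model)
    # Two layer normalizations, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    return attention + feed_forward + layer_norms


# Assumed BERT-base-style hyperparameters (illustrative only).
d_model, dff, num_layers, vocab_size = 768, 3072, 12, 30522

per_layer = encoder_layer_params(d_model, dff)
token_embeddings = vocab_size * d_model
total = num_layers * per_layer + token_embeddings

print(f"Parameters per layer: {per_layer:,}")
print(f"Approximate total:    {total:,}")
```

Under these assumptions the estimate lands a little above 100 million
parameters, and it grows roughly with the square of d_model and linearly with
the number of layers.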


THE TRANSFORMER: THE ENGINE BEHIND LLMS


At the heart of most LLMs is the Transformer architecture, introduced in the
paper "Attention Is All You Need" by Vaswani et al. (2017). Imagine the
Transformer as an advanced orchestra, where different instruments (layers and
attention mechanisms) work in harmony to understand and generate language.
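
The core operation of the Transformer is scaled dot-product attention, which
the paper defines as softmax(Q K^T / sqrt(d_k)) V. The sketch below is a minimal
NumPy version for a single sequence with no masking and no multiple heads. The
function name and the random toy inputs are assumptions made for illustration,
not code from the article.

```python
import numpy as np


def scaled_dot_product_attention(q, k, v):
    """Single-head attention for one sequence: softmax(q k^T / sqrt(d_k)) v."""
    d_k = q.shape[-1]
    # Similarity of every query position with every key position.
    scores = q @ k.T / np.sqrt(d_k)
    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value vectors.
    return weights @ v, weights


# Toy input: 4 token positions, 8 features each (random, for illustration).
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(4, 8))
v = rng.normal(size=(4, 8))

output, weights = scaled_dot_product_attention(q, k, v)
print(output.shape)          # one output vector per token position
print(weights.sum(axis=-1))  # each row of attention weights sums to 1
```

Multi-head attention runs several of these operations in parallel on different
learned projections of the input and concatenates the results.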




TENSORFLOW AND KERAS: YOUR BUILDING BLOCKS

TensorFlow, with its high-level API Keras, is like the set of high-quality tools
and materials you need to start painting. It simplifies building and training
complex models.


BUILDING THE TRANSFORMER WITH TENSORFLOW AND KERAS


STEP 1: SETTING UP YOUR ENVIRONMENT

Before diving into code, ensure you have TensorFlow installed in your Python
environment:

      pip install tensorflow
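
A quick way to confirm the installation worked is to import the package and
print its version. This is only a sanity check, and it assumes the install ran
in the same Python environment you are using now.

```python
import tensorflow as tf

# Prints the installed TensorFlow version string.
print("TensorFlow version:", tf.__version__)

# Lists any GPUs TensorFlow can see; an empty list means CPU only.
print("GPUs visible:", tf.config.list_physical_devices("GPU"))
```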
    


STEP 2: THE ENCODER AND DECODER LAYERS

The Transformer model consists of encoders and decoders. Think of encoders as
scribes, absorbing information, and decoders as orators, producing meaningful
language.

ENCODER LAYER:

      import tensorflow as tf
      from tensorflow.keras.layers import MultiHeadAttention, LayerNormalization, Dense

      class TransformerEncoderLayer(tf.keras.layers.Layer):
          def __init__(self, d_model, num_heads, dff, rate=0.1):
              super(TransformerEncoderLayer, self).__init__()
              # Self-attention; key_dim is the size of each attention head
              self.mha = MultiHeadAttention(num_heads=num_heads, key_dim=d_model)
              # Feed-forward network: expand to dff, then project back to d_model
              self.ffn = tf.keras.Sequential([
                  Dense(dff, activation='relu'),
                  Dense(d_model)
              ])

              self.layernorm1 = LayerNormalization(epsilon=1e-6)
              self.layernorm2 = LayerNormalization(epsilon=1e-6)
              self.dropout1 = tf.keras.layers.Dropout(rate)
              self.dropout2 = tf.keras.layers.Dropout(rate)

          def call(self, x, training=False):
              # Self-attention sub-layer: query, value and key are all x
              attn_output = self.mha(x, x, x)
              attn_output = self.dropout1(attn_output, training=training)
              # Residual connection followed by layer normalization
              out1 = self.layernorm1(x + attn_output)

              # Feed-forward sub-layer with its own residual connection
              ffn_output = self.ffn(out1)
              ffn_output = self.dropout2(ffn_output, training=training)
              out2 = self.layernorm2(out1 + ffn_output)

              return out2
    

This piece of code defines a Transformer Encoder Layer using TensorFlow and
Keras, which are powerful tools for building neural networks. Let’s break the
code down:

IMPORT STATEMENTS:


      import tensorflow as tf
      from tensorflow.keras.layers import MultiHeadAttention, LayerNormalization, Dense
    

Here, we import TensorFlow and specific layers from Keras needed for building
the encoder layer. These layers include MultiHeadAttention for handling the
attention mechanism, LayerNormalization for stabilizing the neural network, and
Dense for fully connected layers.

DEFINING THE TRANSFORMERENCODERLAYER CLASS:

      class TransformerEncoderLayer(tf.keras.layers.Layer):
    

This line begins the definition of the TransformerEncoderLayer class, which
inherits from TensorFlow's Layer class. This custom layer will form one part of
the Transformer model.

INITIALIZATION METHOD (__INIT__):

      def __init__(self, d_model, num_heads, dff, rate=0.1):
    super(TransformerEncoderLayer, self).__init__()
    

The __init__ method initializes the encoder layer. It takes several parameters:

 * d_model: The dimensionality of the input (and output) of the layer.
 * num_heads: The number of heads in the multi-head attention mechanism.
 * dff: The dimensionality of the inner layer in the feed-forward network.
 * rate: The dropout rate used for regularization.

MULTI-HEAD ATTENTION AND FEED-FORWARD NETWORK:

      self.mha = MultiHeadAttention(num_heads, d_model)
self.ffn = tf.keras.Sequential([Dense(dff, activation='relu'), Dense(d_model)])
    

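The encoder layer consists of a multi-head attention mechanism and a
feed-forward neural network. self.mha is an instance of MultiHeadAttention,
whose second argument (key_dim) sets the size of each attention head. This code
passes d_model, whereas the original Transformer uses d_model / num_heads per
head. self.ffn is a simple two-layer feed-forward network with a ReLU activation
in between.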
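
To make the attention mechanism less abstract, here is a minimal NumPy sketch of scaled dot-product attention for a single head. It is an illustration only, not what Keras runs internally: it leaves out the learned query, key, and value projections and the splitting into multiple heads, and the toy sizes (4 tokens, 8 features) are assumptions chosen for readability.

```python
import numpy as np

def softmax(scores, axis=-1):
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v):
    """Single-head attention: softmax(q k^T / sqrt(d_k)) v."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)      # (seq_q, seq_k) similarity scores
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))              # 4 tokens, 8 features each (toy sizes)

# Self-attention: queries, keys, and values all come from the same input.
output, weights = scaled_dot_product_attention(x, x, x)
print(output.shape, weights.shape)
```

Each row of weights says how much one token draws from every other token, which is the quantity a multi-head layer computes several times in parallel with different projections.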

LAYER NORMALIZATION AND DROPOUT:

      self.layernorm1 = LayerNormalization(epsilon=1e-6)
self.layernorm2 = LayerNormalization(epsilon=1e-6)
self.dropout1 = tf.keras.layers.Dropout(rate)
self.dropout2 = tf.keras.layers.Dropout(rate)
    

These lines create instances of layer normalization and dropout layers. Layer
normalization rescales each token's features to a stable range, and dropout
randomly zeroes activations during training to reduce overfitting.
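
If you want to see what these two operations do to the numbers, the following NumPy sketch may help. It is a simplified stand-in, not the Keras implementation: the real LayerNormalization layer also learns a scale and an offset, which are omitted here, and the input values are random toy data.

```python
import numpy as np

def layer_norm(x, epsilon=1e-6):
    # Normalize each token's feature vector to zero mean and unit variance.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + epsilon)

def dropout(x, rate, training, rng):
    # At inference time dropout is a no-op.
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    # Scale the kept units so the expected activation stays the same.
    return x * keep / (1.0 - rate)

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(4, 8))   # 4 tokens, 8 features (toy data)

normed = layer_norm(x)
dropped = dropout(x, rate=0.1, training=True, rng=rng)
same = dropout(x, rate=0.1, training=False, rng=rng)
print(normed.mean(axis=-1).round(6), normed.std(axis=-1).round(3))
```

This is also why the training flag is passed to the dropout layers in call: the random zeroing should only happen while the model is learning.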

ATTENTION AND FEED-FORWARD OPERATIONS:

      attn_output = self.mha(x, x, x)
attn_output = self.dropout1(attn_output, training=training)
out1 = self.layernorm1(x + attn_output)
    

Here, the layer passes its input x through the multi-head attention mechanism,
applies dropout, adds the result back to x (a residual connection), and applies
layer normalization. The feed-forward network then follows the same pattern:
dropout, a residual connection, and normalization.
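
The second half of the layer can be sketched in plain NumPy as well. The weights below are random placeholders and the sizes (d_model=8, dff=32) are assumptions picked to keep the example small; the point is only to show that the feed-forward block widens each token to dff features, projects back to d_model, and keeps the input shape so the residual addition works.

```python
import numpy as np

def layer_norm(x, epsilon=1e-6):
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + epsilon)

def feed_forward(x, w1, b1, w2, b2):
    # Dense(dff, relu) followed by Dense(d_model), applied to every position.
    hidden = np.maximum(0.0, x @ w1 + b1)
    return hidden @ w2 + b2

d_model, dff, seq_len = 8, 32, 4          # toy sizes chosen for illustration
rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(d_model, dff)) * 0.1, np.zeros(dff)
w2, b2 = rng.normal(size=(dff, d_model)) * 0.1, np.zeros(d_model)

out1 = rng.normal(size=(seq_len, d_model))   # stands in for the attention block's output
ffn_output = feed_forward(out1, w1, b1, w2, b2)
out2 = layer_norm(out1 + ffn_output)         # residual connection, then normalization
print(out2.shape)
```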

DECODER LAYER:

      class TransformerDecoderLayer(tf.keras.layers.Layer):
    def __init__(self, d_model, num_heads, dff, rate=0.1):
        super(TransformerDecoderLayer, self).__init__()
        self.mha1 = MultiHeadAttention(num_heads, d_model)
        self.mha2 = MultiHeadAttention(num_heads, d_model)

        self.ffn = tf.keras.Sequential([
            Dense(dff, activation='relu'), 
            Dense(d_model)
        ])

        self.layernorm1 = LayerNormalization(epsilon=1e-6)
        self.layernorm2 = LayerNormalization(epsilon=1e-6)
        self.layernorm3 = LayerNormalization(epsilon=1e-6)
        
        self.dropout1 = tf.keras.layers.Dropout(rate)
        self.dropout2 = tf.keras.layers.Dropout(rate)
        self.dropout3 = tf.keras.layers.Dropout(rate)

    def call(self, x, enc_output, training, look_ahead_mask, padding_mask):
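        # Masked self-attention. Keras' MultiHeadAttention takes (query, value, key)
        # and only returns the attention weights when return_attention_scores=True.
        # Its attention_mask is 1 where attention is allowed and 0 where it is blocked.
        attn1, attn_weights_block1 = self.mha1(
            query=x, value=x, key=x, attention_mask=look_ahead_mask,
            return_attention_scores=True)
        attn1 = self.dropout1(attn1, training=training)
        out1 = self.layernorm1(attn1 + x)

        # Cross-attention: queries come from the decoder, keys and values from the encoder.
        attn2, attn_weights_block2 = self.mha2(
            query=out1, value=enc_output, key=enc_output, attention_mask=padding_mask,
            return_attention_scores=True)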
        attn2 = self.dropout2(attn2, training=training)
        out2 = self.layernorm2(attn2 + out1)

        ffn_output = self.ffn(out2)
        ffn_output = self.dropout3(ffn_output, training=training)
        out3 = self.layernorm3(ffn_output + out2)

        return out3, attn_weights_block1, attn_weights_block2
    

The Transformer Decoder is an essential part of the Transformer model, often
used in tasks like machine translation, text generation, and more. Let’s break
down the parts of the code that are new:

ATTENTION LAYERS

Two multi-head attention layers (mha1 and mha2) are defined. mha1 is used for
self-attention within the decoder, and mha2 is used for attention over the
encoder's output. The feed-forward network (ffn) follows a similar structure to
the encoder.


THE CALL METHOD:


      def call(self, x, enc_output, training, look_ahead_mask, padding_mask):
    

This method is where the layer's operations are defined. It takes additional
parameters compared to the encoder:

 * enc_output: Output from the encoder.

 * look_ahead_mask: To mask future tokens in a sequence (for self-attention).

 * padding_mask: To mask padded positions (for encoder-decoder attention).
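
The two masks can be sketched with NumPy as follows. This follows the convention Keras' MultiHeadAttention uses for attention_mask, where 1 (True) means "may attend" and 0 (False) means "blocked"; some tutorials use the opposite convention. The pad_id of 0 and the token ids are assumptions for illustration.

```python
import numpy as np

def look_ahead_mask(size):
    # Lower-triangular matrix: position i may attend to positions 0..i only.
    return np.tril(np.ones((size, size), dtype=bool))

def padding_mask(token_ids, pad_id=0):
    # True for real tokens, False for padding.
    return np.asarray(token_ids) != pad_id

# Row i has ones in columns 0..i and zeros after that.
print(look_ahead_mask(4).astype(int))

# The last two positions are padding, so they are marked with zeros.
print(padding_mask([[7, 12, 5, 0, 0]]).astype(int))
```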

ATTENTION AND FEED-FORWARD OPERATIONS:

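      attn1, attn_weights_block1 = self.mha1(
    query=x, value=x, key=x, attention_mask=look_ahead_mask,
    return_attention_scores=True)
attn2, attn_weights_block2 = self.mha2(
    query=out1, value=enc_output, key=enc_output, attention_mask=padding_mask,
    return_attention_scores=True)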
    

The decoder processes its input through two multi-head attention layers. The
first one (attn1) is self-attention with a look-ahead mask. The second one
(attn2) attends over the encoder's output: the queries come from the decoder
(out1), while the keys and values come from enc_output. This is followed by the
feed-forward network. Each step involves dropout, a residual connection, and
normalization.
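
As a rough illustration of the two attention steps, the NumPy sketch below runs masked self-attention over three decoder tokens and then cross-attention against five encoder tokens. It is deliberately simplified (one head, no learned projections, no dropout, residuals, or normalization), and the sequence lengths are arbitrary assumptions.

```python
import numpy as np

def attention(q, k, v, mask=None):
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if mask is not None:
        # Blocked positions get a large negative score, so softmax gives them zero weight.
        scores = np.where(mask, scores, -1e9)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
dec_x = rng.normal(size=(3, 8))        # 3 target tokens generated so far
enc_output = rng.normal(size=(5, 8))   # 5 encoded source tokens

causal = np.tril(np.ones((3, 3), dtype=bool))
attn1, w1 = attention(dec_x, dec_x, dec_x, mask=causal)   # masked self-attention
attn2, w2 = attention(attn1, enc_output, enc_output)      # queries from decoder, keys/values from encoder

print(w1.round(2))             # zeros above the diagonal: no peeking at future tokens
print(attn2.shape, w2.shape)   # one output row per decoder token, one weight per source token
```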


STEP 3: ASSEMBLING THE TRANSFORMER

Think of this step as assembling your orchestra. Each encoder and decoder layer
is an instrument, and you're arranging them to create harmony.

FULL TRANSFORMER MODEL:

      class Transformer(tf.keras.Model):
    def __init__(self, num_layers, d_model, num_heads, dff, input_vocab_size, 
                 target_vocab_size, pe_input, pe_target, rate=0.1):
        super(Transformer, self).__init__()
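        # Encoder and Decoder are not defined in this article; they are assumed to be
        # Keras layers that stack the encoder and decoder layers built in Step 2.
        self.encoder = Encoder(num_layers, d_model, num_heads, dff,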
                               input_vocab_size, pe_input, rate)
        self.decoder = Decoder(num_layers, d_model, num_heads, dff, 
                               target_vocab_size, pe_target, rate)

        self.final_layer = tf.keras.layers.Dense(target_vocab_size)

    def call(self, inp, tar, training, enc_padding_mask, 
             look_ahead_mask, dec_padding_mask):
        enc_output = self.encoder(inp, training, enc_padding_mask)
        dec_output, attention_weights = self.decoder(
            tar, enc_output, training, look_ahead_mask, dec_padding_mask)

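        final_output = self.final_layer(dec_output)

        # Return the vocabulary logits together with the decoder's attention weights.
        return final_output, attention_weights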
    
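
The final Dense layer produces one score (a logit) per vocabulary entry at every position. A small NumPy sketch of how those scores might be turned into words is shown below. The four-word vocabulary and the logit values are made up for illustration, and greedy argmax is just one possible decoding strategy.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

vocab = ["<pad>", "the", "cat", "sat"]   # hypothetical 4-word target vocabulary

# Stand-in for final_output on one sequence: shape (seq_len, target_vocab_size).
final_output = np.array([[0.1, 2.0, 0.3, -1.0],
                         [0.0, 0.2, 3.0, 0.5]])

probs = softmax(final_output)            # each row is a distribution over the vocabulary
next_ids = probs.argmax(axis=-1)         # greedy choice per position
print([vocab[i] for i in next_ids])      # ['the', 'cat']
```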


TRAINING THE MODEL

With the Transformer model assembled, it's time to train it. This process is
like teaching the orchestra to play a symphony, where the symphony is the task
you want your model to perform (e.g., language translation, text generation).


PREPARING THE DATA

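Data preparation involves collecting a large dataset of text and processing it
into a format suitable for training: tokenizing the text, mapping tokens to
integer IDs, and grouping the sequences into batches. TensorFlow's tf.data API
can be used to build this input pipeline.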
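
As a toy illustration of the "text to integer IDs" step, here is a pure-Python sketch with a word-level vocabulary. Real LLM pipelines typically use subword tokenizers instead, and the helper names, special tokens, and corpus below are all assumptions made for this example.

```python
def build_vocab(sentences):
    # Reserve id 0 for padding and id 1 for unknown words.
    vocab = {"<pad>": 0, "<unk>": 1}
    for sentence in sentences:
        for word in sentence.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def encode(sentence, vocab, max_len):
    # Map each word to its id, falling back to <unk>, and truncate long inputs.
    ids = [vocab.get(word, vocab["<unk>"]) for word in sentence.lower().split()]
    return ids[:max_len]

corpus = ["the cat sat", "the dog sat down"]
vocab = build_vocab(corpus)
encoded = [encode(s, vocab, max_len=16) for s in corpus]

print(encoded)                                     # [[2, 3, 4], [2, 5, 4, 6]]
print(encode("The bird sat", vocab, max_len=16))   # [2, 1, 4]
```

The encoded sequences still have different lengths, so they would need to be padded to a common length before being batched, for example with tf.data.Dataset.from_tensor_slices followed by batch.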


TRAINING LOOP

      for epoch in range(epochs):
    # Initialize the training step
    for (batch, (inp, tar)) in enumerate(dataset):
        # Training code here: run the forward pass, compute the loss,
        # and apply the gradients for this batch
        pass
    
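
The article leaves the body of the training step open. Two ingredients that commonly go into it for sequence-to-sequence models are sketched below in NumPy: shifting the target by one position (teacher forcing) and a cross-entropy loss that ignores padding. This is an assumption about what the step might contain, not the author's implementation; the token ids, the pad_id of 0, and the random logits are placeholders.

```python
import numpy as np

def masked_cross_entropy(logits, targets, pad_id=0):
    # Log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Negative log-probability of the correct token at each position.
    token_loss = -np.take_along_axis(log_probs, targets[..., None], axis=-1)[..., 0]
    # Average over real tokens only, so padding does not affect the loss.
    mask = targets != pad_id
    return (token_loss * mask).sum() / mask.sum()

tar = np.array([[1, 5, 6, 2, 0]])   # hypothetical ids: 1 = <start>, 2 = <end>, 0 = <pad>
tar_inp = tar[:, :-1]               # decoder input: everything except the last token
tar_real = tar[:, 1:]               # expected output: the same sequence shifted left by one

rng = np.random.default_rng(0)
logits = rng.normal(size=(1, tar_real.shape[1], 8))   # stand-in for the model's final_output
loss = masked_cross_entropy(logits, tar_real)
print(tar_inp, tar_real, round(float(loss), 3))
```

In a TensorFlow training step, a loss like this would be computed inside a gradient tape and the resulting gradients applied by an optimizer.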

However, in the following section we will explore how to leverage existing LLMs
through transfer learning instead.


IMPLEMENTING TRANSFER LEARNING WITH HUGGING FACE

Transfer learning in the context of LLMs is akin to an apprentice learning from
a master craftsman. Instead of starting from scratch, you leverage a pre-trained
model and fine-tune it for your specific task. Hugging Face provides an
extensive library of pre-trained models which can be fine-tuned for various NLP
tasks.


SETTING UP HUGGING FACE TRANSFORMERS

First, you need to install the Hugging Face transformers library:

      pip install transformers
    


LOADING A PRE-TRAINED MODEL

Choose a pre-trained model from Hugging Face's model hub. For this example,
let's use bert-base-uncased, a popular BERT model:

      from transformers import BertTokenizer, TFBertModel

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = TFBertModel.from_pretrained('bert-base-uncased')
    


PREPARING DATA FOR FINE-TUNING

Suppose you're fine-tuning the model for a sentiment analysis task. First,
preprocess your data:

      # Example sentences
sentences = ["I love this product!", "This is a bad product."]

# Tokenize sentences
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="tf")
    

Notice that we have to use the BERT tokenizer so that everything is tokenized
and padded exactly as BERT expects.
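
To picture what padding=True gives you, here is a small pure-Python sketch of padding a batch and building the matching attention mask. The helper is hypothetical and the word ids are made up; only 101, 102, and 0 correspond to the [CLS], [SEP], and [PAD] tokens of bert-base-uncased.

```python
def pad_batch(sequences, pad_id=0):
    # Pad every sequence to the length of the longest one in the batch.
    max_len = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    # 1 marks a real token, 0 marks padding the model should ignore.
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return input_ids, attention_mask

# Word ids below are made up for illustration; 101 and 102 stand for [CLS] and [SEP].
sequences = [[101, 11, 12, 13, 14, 102], [101, 21, 22, 102]]
input_ids, attention_mask = pad_batch(sequences)

print(input_ids)       # [[101, 11, 12, 13, 14, 102], [101, 21, 22, 102, 0, 0]]
print(attention_mask)  # [[1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 0, 0]]
```

The tokenizer returns tensors with these same two roles under the keys input_ids and attention_mask.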


FINE-TUNING THE MODEL

Now, you can add a classification layer on top of the pre-trained model and
fine-tune it:

      import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Define input layers
input_ids = Input(shape=(None,), dtype='int32', name="input_ids")
attention_mask = Input(shape=(None,), dtype='int32', name="attention_mask")

# Run the inputs through the pre-trained BERT model loaded above
bert = model(input_ids, attention_mask=attention_mask)

# Add a classification layer on top of the [CLS] token's representation
x = bert.last_hidden_state[:, 0, :]
x = Dense(128, activation='relu')(x)
output = Dense(1, activation='sigmoid')(x)

# Construct the final model
fine_tuned_model = Model(inputs=[input_ids, attention_mask], outputs=[output])

# Compile the model; fine-tuning usually needs a much smaller learning rate
# than Adam's default of 1e-3 (2e-5 is a common starting point for BERT)
fine_tuned_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
                         loss='binary_crossentropy', metrics=['accuracy'])

# Example labels for the sentences
labels = tf.constant([1, 0])  # 1 for positive, 0 for negative sentiment

# Keep only the tokenizer outputs that match the model's two inputs
train_inputs = {"input_ids": inputs["input_ids"],
                "attention_mask": inputs["attention_mask"]}

# Train the model
fine_tuned_model.fit(train_inputs, labels, epochs=3, batch_size=32)

    
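
If the classification head feels opaque, the NumPy sketch below walks through the same arithmetic on toy data: take the vector at position 0 (the [CLS] token), pass it through a small ReLU layer and a sigmoid output, and score the result with binary cross-entropy. The random "hidden states", the tiny layer sizes, and the weights are placeholders, not values from BERT.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    y_prob = np.clip(y_prob, eps, 1.0 - eps)   # avoid log(0)
    return float(-np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob)))

rng = np.random.default_rng(0)
hidden = rng.normal(size=(2, 6, 16))   # (batch, tokens, hidden size); toy stand-in for last_hidden_state
cls_vector = hidden[:, 0, :]           # first position holds the [CLS] token

w1, b1 = rng.normal(size=(16, 4)) * 0.1, np.zeros(4)   # plays the role of Dense(128, relu)
w2, b2 = rng.normal(size=(4, 1)) * 0.1, np.zeros(1)    # plays the role of Dense(1, sigmoid)

probs = sigmoid(np.maximum(0.0, cls_vector @ w1 + b1) @ w2 + b2)   # shape (2, 1)
labels = np.array([[1.0], [0.0]])      # 1 for positive, 0 for negative sentiment
print(probs.shape, binary_cross_entropy(labels, probs))
```

During fine-tuning, the optimizer adjusts both the head's weights and BERT's own weights to push this loss down.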


TESTING THE FINE-TUNED MODEL

Finally, test the fine-tuned model on new sentences:

      test_sentences = ["I am not sure about this product.", "Absolutely fantastic!"]
test_inputs = tokenizer(test_sentences, padding=True, truncation=True, return_tensors="tf")

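# Pass only the inputs the model was built with
predictions = fine_tuned_model.predict({
    "input_ids": test_inputs["input_ids"],
    "attention_mask": test_inputs["attention_mask"],
})

# Interpret the predictions (each row holds a single sigmoid probability)
for sentence, prediction in zip(test_sentences, predictions):
    sentiment = "Positive" if prediction[0] > 0.5 else "Negative"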
    print(f"Sentence: '{sentence}' - Sentiment: {sentiment}")
    


CONCLUSION

Creating an LLM from scratch is an intricate yet immensely rewarding process. By
understanding and building upon the Transformer architecture with TensorFlow and
Keras, and leveraging transfer learning through Hugging Face, you can create a
model that's not just a powerful NLP tool but a reflection of your unique
approach to understanding language.

As you continue on this journey, remember that the field of NLP is
ever-evolving, and there's always more to learn and explore. Happy modeling!


FURTHER LEARNING RESOURCES

 * Working with Pre-trained NLP Models video course
 * Pluralsight's Large Language Models (LLM) learning path
 * Introduction to Large Language Models for Data Practitioners video course
 * A blueprint for responsible innovation with Large Language Models
 * LLMs in action: How to use them for real-world applications



Axel S.

Axel Sirota is a Microsoft Certified Trainer with a deep interest in Deep
Learning and Machine Learning Operations. He has a Master's degree in
Mathematics and, after doing research in probability, statistics, and machine
learning optimization, he works as an AI and Cloud Consultant as well as an
Author and Instructor at Pluralsight, Develop Intelligence, and O'Reilly Media.

Copyright © 2004 - 2024 Pluralsight LLC. All rights reserved