URL:
https://blogs.sas.com/content/sgf/2021/02/22/deleting-a-substring-from-a-sas-string/
DELETING A SUBSTRING FROM A SAS STRING

By Leonid Batkhan on SAS Users, February 22, 2021
Topics: Learn SAS, Programming Tips

In my previous post, we addressed the problem of inserting substrings into SAS character strings. In this post we will solve the reverse problem of deleting substrings from SAS strings.

See also: Inserting a substring into a SAS string

These two complementary tasks are commonly used for character data manipulation during data cleansing and preparation, to transform data into a shape suitable for analysis, text mining, reporting, modeling and decision making.

As in the previous case of substring insertion, we will cover substring deletion for both character variables and macro variables, since both data objects are strings.

The following diagram illustrates what we are going to achieve by deleting a substring from a string:

[Diagram: deleting a substring from a string]

Have you noticed a logical paradox? We take away a “pieceof” cake and get the whole thing as a result! 😊

Now, let’s get serious.

DELETING ALL INSTANCES OF A SUBSTRING FROM A CHARACTER VARIABLE

Let’s suppose we have a variable STR whose values are sprinkled with an undesirable substring ‘<br>’, which we inherited from some HTML code where the tag <br> denotes a line break.
For our purposes, we want to remove all instances of those pesky <br>’s. First, let’s create a source data set imitating the described “contaminated” data:

    data HAVE;
       infile datalines truncover;
       input STR $100.;
       datalines;
    Some strings<br> have unwanted sub<br>strings in them<br>
    <br>A s<br>entence must not be cont<br>aminated with unwanted subs<br>trings
    Several line<br> breaks<br> are inserted here<br><br><br>
    <br>Resulting st<br>ring must be n<br>eat and f<br>ree from un<br>desirable substrings
    Ugly unwanted substrings<br><br> must <br>be<br> removed
    <br>Let's remove them <br>using S<br>A<br>S language
    Ex<br>periment is a<br>bout to b<br>egin
    <br>Simpli<br>city may sur<br>prise you<br><br>
    ;

This DATA step creates the WORK.HAVE data set, which looks pretty ugly and is hardly usable. The following code, however, cleans it up by removing all those unwanted substrings ‘<br>’:

    data WANT (keep=NEW_STR);
       length NEW_STR $100;
       SUB = '<br>';
       set HAVE;
       NEW_STR = transtrn(STR,SUB,trimn(''));
    run;

After this code runs, the data set WANT will look totally clean and usable.

CODE HIGHLIGHTS

* We use the TRANSTRN(source, target, replacement) function, which does exactly what we need: it replaces or removes all occurrences of a substring (target) in a character string (source). To remove all occurrences of target, we specify replacement as TRIMN(''). The TRANSTRN function is similar to the TRANWRD function, which replaces all occurrences of a substring in a character string. While TRANWRD uses a single blank when the replacement string has a length of zero, TRANSTRN allows the replacement string to have a length of zero, which essentially means removing.
* The TRIMN(argument) function removes trailing blanks from character expressions and returns a string with a length of zero if its argument has a missing value. It is similar to the TRIM() function, which removes trailing blanks from a character string and returns one blank if the string is missing.
  However, when it comes to removing (which is essentially replacement with a zero-length substring), the ability of the TRIMN function to return a zero-length string makes all the difference.

DELETING ALL INSTANCES OF A SUBSTRING FROM A SAS MACRO VARIABLE

For macro variables, I can see two distinct methods of removing all occurrences of an undesirable substring.

METHOD 1: USING SAS DATA STEP

Here is a code example:

    %let STR = Some strings<br> have unwanted sub<br>strings in them<br>;
    %let SUB = <br>;
    data _null_;
       NEW_STR = transtrn("&STR","&SUB",trimn(''));
       call symputx('NEW',NEW_STR);
    run;
    %put &=STR;
    %put &=NEW;

In this code, we place the macro variable value &STR in double quotes as the first argument (source) of the transtrn() function. The macro variable value &SUB, also double quoted, is the second argument. After the variable NEW_STR is produced free from the &SUB substrings, we create a macro variable NEW using the call symputx() routine. The SAS log will show the old and new values:

    STR=Some strings<br> have unwanted sub<br>strings in them<br>
    NEW=Some strings have unwanted substrings in them

METHOD 2: USING SAS MACRO LANGUAGE AND %SYSFUNC

Here is a code example:

    %let STR = Some strings<br> have unwanted sub<br>strings in them<br>;
    %let SUB = <br>;
    %let NEW = %sysfunc(transtrn(&STR,&SUB,%sysfunc(trimn(%str()))));
    %put &=STR;
    %put &=NEW;

DELETING SELECTED INSTANCE OF A SUBSTRING FROM A CHARACTER VARIABLE

In many cases we need to remove not all substring instances from a string, but rather a specific occurrence of a substring. For example, in the following sentence (which is a quote by Albert Einstein)

“I believe in intuitions and inspirations. I sometimes feel that I am right. I sometimes do not know that I am.”

the second instance of the word “sometimes” was added by mistake. It needs to be removed. Here is a code example presenting two solutions of how such a deletion can be done:

    data A;
       length STR STR1 STR2 $250;
       STR = 'I believe in intuitions and inspirations. I sometimes feel that I am right. I sometimes do not know that I am.';
       SUB = 'sometimes';
       STR_LEN = length(STR);
       SUB_LEN = length(SUB);
       POS = find(STR,SUB,-STR_LEN);
       STR1 = catx(' ', substr(STR,1,POS-1), substr(STR,POS+SUB_LEN)); /* solution 1 */
       STR2 = kupdate(STR,POS,SUB_LEN+1);                              /* solution 2 */
       put STR1= / STR2=;
    run;

The code will produce two correct, identical values of this quote in the SAS log (notice that the second instance of the word “sometimes” is gone):

    STR1=I believe in intuitions and inspirations. I sometimes feel that I am right. I do not know that I am.
    STR2=I believe in intuitions and inspirations. I sometimes feel that I am right. I do not know that I am.

CODE HIGHLIGHTS

* The LENGTH() function determines the length STR_LEN of our initial string STR and the length SUB_LEN of our substring SUB.
* The FIND() function determines the position POS of the substring SUB to be deleted in the string STR. In this particular example, we used the fact that the second occurrence of the word “sometimes” is the first occurrence of this word when counted from right to left. That is indicated by the negative third argument (-STR_LEN), which means that the FIND function searches STR for SUB starting from position STR_LEN, from right to left.

SOLUTION 1

This is the most traditional solution: it cuts out two pieces of the string (before and after the substring being deleted) and then concatenates them, thus removing that substring:

* substr(STR,1,POS-1) extracts the first part of the source string STR before the substring to be deleted: from position 1 to position POS-1.
* substr(STR,POS+SUB_LEN) extracts the second part of the source string STR after the substring to be deleted: from position POS+SUB_LEN to the end of the STR value (since the third argument, length, is not specified).
* The CATX() function stitches (concatenates) these two parts together, thus eliminating the second word “sometimes”.
  It also removes leading and trailing blanks from each piece and separates the two pieces with a blank (as specified by its first argument).

SOLUTION 2

The KUPDATE() function provides a more elegant (and shorter) solution. In the kupdate(STR,POS,SUB_LEN+1) expression:

* The first argument specifies the source string STR.
* The second argument POS specifies the position of the beginning of the substring.
* The third argument SUB_LEN+1 specifies the length of the substring that we want to remove (+1 accounts for the extra blank after the word 'sometimes').
* The optional fourth argument specifies the “characters-to-replace” the substring. Since we omitted it (specified none), nothing will replace the substring; that is, it will be deleted.

CODE NOTES

* If you know the substring value and the exact position in STR from which to delete that substring, you may skip the FIND() part of the code and just specify the position POS.
* If you need to delete the n-th instance of your substring, you may find its position by using the FINDNTH() function described in my post Finding n-th instance of a substring within a string.

DELETING SELECTED INSTANCE OF A SUBSTRING FROM A SAS MACRO VARIABLE

Here is a code example of how to solve the same problem as it relates to SAS macro variables. For brevity, we provide just one solution, using %sysfunc and the KUPDATE() function:

    %let STR = I believe in intuitions and inspirations. I sometimes feel that I am right. I sometimes do not know that I am.;
    %let SUB = sometimes;
    %let POS = %sysfunc(find(&STR,&SUB,-%length(&STR)));
    %let STR2 = %sysfunc(kupdate(&STR,&POS,%eval(%length(&SUB)+1)));
    %put "&STR2";

This should produce the following corrected Einstein quote in the SAS log:

    "I believe in intuitions and inspirations. I sometimes feel that I am right. I do not know that I am."
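The CODE NOTES above mention deleting the n-th instance of a substring. As a minimal sketch that does not rely on the FINDNTH() function from the other post, the position can also be found by calling FIND() repeatedly with an advancing start position. The variable names N, K and NEW_STR are illustrative and not part of the original code, and the sketch assumes the targeted occurrence is neither at the very beginning nor at the very end of STR (otherwise the SUBSTR() calls would need guarding).

```sas
/* Sketch: delete the N-th occurrence of SUB from STR.              */
/* N, K and NEW_STR are hypothetical names used for illustration.   */
data _null_;
   length STR NEW_STR $250;
   STR = 'I believe in intuitions and inspirations. I sometimes feel that I am right. I sometimes do not know that I am.';
   SUB = 'sometimes';
   N   = 2;                            /* which occurrence to delete */
   POS = 0;
   do K = 1 to N;
      POS = find(STR, SUB, POS + 1);   /* resume search just past the previous hit */
      if POS = 0 then leave;           /* fewer than N occurrences exist */
   end;
   if POS > 0 then
      /* same cut-and-concatenate idea as Solution 1 */
      NEW_STR = catx(' ', substr(STR, 1, POS - 1), substr(STR, POS + length(SUB)));
   else
      NEW_STR = STR;                   /* nothing to delete: keep the string unchanged */
   put NEW_STR=;
run;
```

With N = 2 this should give the same value as STR1 in Solution 1; with N = 1 it would remove the first “sometimes” instead.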
ADDITIONAL RESOURCES FOR SAS CHARACTER STRINGS PROCESSING

* Inserting a substring into a SAS string
* Removing repeated characters in SAS strings
* How to unquote SAS character variable values
* Expanding lengths of all character variables in SAS data sets
* Finding n-th instance of a substring within a string

YOUR THOUGHTS?

Have you found this blog post useful? Please share your thoughts and feedback in the comments section below.

Tags: character data, learn sas, SAS Programmers, text mining

ABOUT AUTHOR

Leonid Batkhan is an independent SAS consultant and blogger. He holds a Ph.D. in Computer Science and Automatic Control Systems and has been a SAS user for more than 25 years. From 1995 to 2021 he worked as a Data Management and Business Intelligence consultant at SAS Institute. During his career, Leonid has successfully implemented dozens of SAS applications and projects in various industries.

16 COMMENTS

1. Mike Zdeb on December 17, 2021 11:30 am

   Hi, curious ... what's the purpose of the variable SUB in this code? Thanks.

       data WANT (keep=NEW_STR);
          length NEW_STR $100;
          SUB = '<br>';
          set HAVE;
          NEW_STR = transtrn(STR,'<br>',trimn(''));
       run;

   * Leonid Batkhan on December 17, 2021 12:26 pm

     Hi Mike, thank you for this catch. I meant to use SUB within the expression NEW_STR = transtrn(STR,SUB,trimn('')); instead of the hard-coded NEW_STR = transtrn(STR,'<br>',trimn('')); I have corrected this in the blog.

2. Louise Hadden on March 25, 2021 4:21 pm

   Nice article! I had not used two of the functions referenced and I'm excited to try them. Thank you.

   * Leonid Batkhan on March 25, 2021 4:41 pm

     You are welcome, and thank you, Louise, for your lovely comment. I feel rewarded that even such a SAS pundit as you can learn something new from my blog 🙂

3. Ksharp on February 28, 2021 7:41 am

   Perl regular expressions are good for this kind of question.

       data HAVE;
          infile datalines truncover;
          input STR $100.;
          datalines;
       Some strings<br> have unwanted sub<br>strings in them<br>
       <br>A s<br>entence must not be cont<br>aminated with unwanted subs<br>trings
       Several line<br> breaks<br> are inserted here<br><br><br>
       <br>Resulting st<br>ring must be n<br>eat and f<br>ree from un<br>desirable substrings
       Ugly unwanted substrings<br><br> must <br>be<br> removed
       <br>Let's remove them <br>using S<br>A<br>S language
       Ex<br>periment is a<br>bout to b<br>egin
       <br>Simpli<br>city may sur<br>prise you<br><br>
       ;

       data want;
          set have;
          want=prxchange('s/<.+?>//',-1,str);
       run;

   * Leonid Batkhan on February 28, 2021 1:15 pm

     Thank you, Ksharp, for your constructive comment. Indeed, Perl regular expressions are very powerful and can be used in SAS via the prxchange() function. However, I found it to be considerably less efficient than using SAS string manipulation functions. For example, I ran the following code, and it turned out that the transtrn(STR,'<br />',trimn('')) approach ran more than twice as fast as prxchange('s/<.+?>//',-1,str):

         /* 25 sec */
         data LONG;
            set HAVE;
            do i=1 to 10000000;
               output;
            end;
         run;

         /* 3:51 min = 231 sec */
         data WANT;
            set LONG;
            want=prxchange('s/<.+?>//',-1,str);
         run;

         /* 1:48 min = 108 sec */
         data WANT1;
            set LONG;
            want=transtrn(STR,'<br />',trimn(''));
         run;

     Could you run this (or a similar) test on your machine to see if your results are consistent with mine?

   * Ksharp on March 1, 2021 7:58 am

     Agree. But PRX could handle many tags like: <a href="." rel="nofollow ugc">... and so on</a> and needs less code and is more powerful. Everyone has different preferences, I guess.

   * Leonid Batkhan on March 2, 2021 10:19 am

     To me it is always a choice between 1) code length, 2) run time, and 3) code clarity. In my current projects I am dealing with rather large data, therefore program efficiency (run time) is paramount. For this particular example, code length is not much different: transtrn(STR,'<br />',trimn('')) vs. prxchange('s/<.+?>//',-1,str). However, processing time for the first code snippet is considerably less. Besides, code clarity and repeatability are better. Having said that, for some string processing tasks (e.g. data validation) using regular expressions can produce more robust solutions.

4. Allan Bowe on February 26, 2021 5:58 am

   SAS Rules, everywhere! A useful / instructive article, thanks for sharing.

   * Leonid Batkhan on February 26, 2021 10:05 am

     You are welcome, Allan. Thanks for sharing your feedback!

5. Ronan on February 25, 2021 11:32 am

   Nice pieces of code, especially the TRIMN('') pseudo-constant and the KUPDATE. Thanks for sharing! What's more, according to the documentation (and to the first letter of its name as well), the KUPDATE function generously applies to single-byte encoded variables *and* multi-byte UTF ones too, and is therefore fully Viya enabled. Why not byte off more than you can chew? 😉

   * Leonid Batkhan on February 25, 2021 11:41 am

     Thank you, Ronan, for your very constructive comment. Indeed, the KUPDATE function is extremely powerful.

6. Bartosz Jabłoński on February 22, 2021 1:26 pm

   Nice counterpart to the previous article. As an "extra", let me remind you of the COMPBL() function, which allows you to remove multiple blanks from a string 🙂 All the best, Bart

   * Leonid Batkhan on February 22, 2021 1:37 pm

     Thank you, Bart! Yes, COMPBL() is a very handy function. I also wrote the UNDUPC() function, which removes ANY repeated characters from a string.

7. Bill Wisotsky on February 22, 2021 11:08 am

   Very useful. Thanks

   * Leonid Batkhan on February 22, 2021 11:10 am

     Great, I am happy to hear that, Bill. I am sure you will put it to use.
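As a footnote to the PRXCHANGE discussion in comment 3: the pattern 's/<.+?>//' removes every tag-like substring, not only line breaks. A middle ground, sketched here as an assumption rather than anything proposed in the thread, is a pattern restricted to the common spellings of the line-break tag (<br>, <BR/>, <br />). The sample string below is made up for illustration, and no timing claim is implied.

```sas
/* Sketch: remove only line-break tags, in any letter case, with an */
/* optional space and slash before the closing bracket.             */
/* The sample value of STR is hypothetical.                         */
data _null_;
   length STR NEW_STR $100;
   STR = 'Some strings<br> have unwanted sub<BR/>strings in them<br />';
   /* -1 as the second argument means "replace all matches" */
   NEW_STR = prxchange('s/<br\s*\/?>//i', -1, STR);
   put NEW_STR=;
run;
```

Other tags in the string would be left untouched by this pattern, which may or may not be what a given cleanup task needs.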