.NET Core Tutorials





HOW TO PARSE XML IN C# .NET

by Wade

One of the most popular posts on this blog is a very simple write-up on how to
parse JSON in C# .NET. I mostly wrote it because I thought that there was
definitely a “proper” way of doing things, and people were almost going out of
their way to make life difficult for themselves when working with JSON.


I think working with XML is slightly different because (just IMO) there still
isn’t a “gold standard” library for XML.

Unlike JSON, which has the incredible JSON.NET library to handle anything and
everything, in the majority of cases when you work with XML you’ll use one of
the inbuilt XML parsers inside .NET Core. These can be frustrating at times and
incredibly brittle. Part of it is that they were created very early on in the
life of .NET and have had to stay backwards compatible ever since, so you lose
out on things like generics. The other part is that the actual XML spec, which
involves things like namespaces and DTDs, looks simple at first but can be
incredibly harsh. By harsh I mean that things will just plain not work if you
are missing just one piece of the puzzle, and it can take hours to work out
what’s wrong.



Anyway, let’s jump right in and check out our options for working with XML in 
C# .NET.






OUR EXAMPLE XML FILE

I’m going to be using a very simple XML file that has a plain element, an
element with an attribute, and a list. I’ll use the same file for every option
so we are always comparing like for like.

<?xml version="1.0" encoding="utf-8" ?>
<MyDocument xmlns="http://www.dotnetcoretutorials.com/namespace">
  <MyProperty>Abc</MyProperty>
  <MyAttributeProperty value="123" />
  <MyList>
    <MyListItem>1</MyListItem>
    <MyListItem>2</MyListItem>
    <MyListItem>3</MyListItem>
  </MyList>
</MyDocument>


USING XMLREADER

So the first option we have is the class “XmlReader”. It’s a forward-only XML
parser (by that I mean you read the file node by node, start to finish, with no
going back). I’ll warn you now, it’s very, very primitive. For example, our code
might look a bit like so:

using System;
using System.IO;
using System.Xml;

// Skip whitespace-only nodes so they don't show up in the output.
XmlReaderSettings settings = new XmlReaderSettings();
settings.IgnoreWhitespace = true;

using (var fileStream = File.OpenText("test.xml"))
using (XmlReader reader = XmlReader.Create(fileStream, settings))
{
    // Each call to Read() advances to the next node in the document.
    while (reader.Read())
    {
        switch (reader.NodeType)
        {
            case XmlNodeType.Element:
                Console.WriteLine($"Start Element: {reader.Name}. Has Attributes? : {reader.HasAttributes}");
                break;
            case XmlNodeType.Text:
                Console.WriteLine($"Inner Text: {reader.Value}");
                break;
            case XmlNodeType.EndElement:
                Console.WriteLine($"End Element: {reader.Name}");
                break;
            default:
                // Anything we don't handle explicitly, e.g. the XML declaration.
                Console.WriteLine($"Unknown: {reader.NodeType}");
                break;
        }
    }
}

With the output looking like :



Unknown: XmlDeclaration
Start Element: MyDocument. Has Attributes? : True
Start Element: MyProperty. Has Attributes? : False
Inner Text: Abc
End Element: MyProperty
Start Element: MyAttributeProperty. Has Attributes? : True
Start Element: MyList. Has Attributes? : False
Start Element: MyListItem. Has Attributes? : False
Inner Text: 1
End Element: MyListItem
Start Element: MyListItem. Has Attributes? : False
Inner Text: 2
End Element: MyListItem
Start Element: MyListItem. Has Attributes? : False
Inner Text: 3
End Element: MyListItem
End Element: MyList
End Element: MyDocument
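
The loop above only reports whether an element has attributes, not what they
are. As a rough sketch of reading the values too, something like the following
could work. ListAttributes is a hypothetical helper name of my own, and the XML
is inlined as a string here (rather than read from test.xml) purely to keep the
example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper (not from the article): collect every attribute in the
// document as "Element@name = value" while streaming through it once.
static List<string> ListAttributes(TextReader input)
{
    var found = new List<string>();
    using XmlReader reader = XmlReader.Create(input);

    while (reader.Read())
    {
        if (reader.NodeType != XmlNodeType.Element || !reader.HasAttributes)
        {
            continue;
        }

        string element = reader.LocalName;

        // Step through the attributes of the current element one at a time.
        while (reader.MoveToNextAttribute())
        {
            found.Add($"{element}@{reader.Name} = {reader.Value}");
        }

        // Return to the owning element before reading on.
        reader.MoveToElement();
    }

    return found;
}

// Assumption: a trimmed copy of the sample document, inlined for the example.
const string sampleXml = @"<MyDocument xmlns=""http://www.dotnetcoretutorials.com/namespace"">
  <MyProperty>Abc</MyProperty>
  <MyAttributeProperty value=""123"" />
</MyDocument>";

foreach (string line in ListAttributes(new StringReader(sampleXml)))
{
    Console.WriteLine(line);
}
```

Note that the xmlns declaration on MyDocument is itself surfaced as an
attribute, which is why that element reports having attributes in the output
above.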

It sort of reminds me of using ADO.NET and reading data row by row and trying to
store it in an object. The general idea is that because you are only parsing one
node at a time, it’s less memory intensive. But you’re also having to handle
each node individually, with any number of permutations of
elements/attributes/lists etc. I think the only reasons to use this method would
be if you have extremely large XML files (100+MB), or you are looking for
something very specific, e.g. you only want to read a single element from the
file and you don’t want to load the entire thing while looking for that one
element.
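
For that “single element” case, a minimal sketch might look like the below. It
leans on XmlReader.ReadToFollowing to skip ahead, and ReadFirstElementText is a
hypothetical helper name of my own, not something from the framework. The XML is
inlined as a string only so the example runs on its own.

```csharp
#nullable enable
using System;
using System.IO;
using System.Xml;

// Hypothetical helper (not from the article): stream forward to the first
// element with the given name and return its text, or null if it isn't there.
static string? ReadFirstElementText(TextReader input, string localName, string namespaceUri)
{
    using XmlReader reader = XmlReader.Create(input);

    // ReadToFollowing advances node by node and stops at the first match,
    // so nothing after that element needs to be parsed.
    return reader.ReadToFollowing(localName, namespaceUri)
        ? reader.ReadElementContentAsString()
        : null;
}

const string ns = "http://www.dotnetcoretutorials.com/namespace";

// Assumption: a trimmed copy of the sample document, inlined for the example.
const string sampleXml = @"<MyDocument xmlns=""http://www.dotnetcoretutorials.com/namespace"">
  <MyProperty>Abc</MyProperty>
  <MyAttributeProperty value=""123"" />
</MyDocument>";

Console.WriteLine(ReadFirstElementText(new StringReader(sampleXml), "MyProperty", ns)); // prints "Abc"
```

With a real file you would pass File.OpenText("test.xml") instead of the
StringReader, and the reader would stop as soon as it hit the element you asked
for.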

Another thing I will point out is that the usual difficulty around XML
namespaces wasn’t there with XmlReader. It just sort of powered through and
there wasn’t any issue around prefixes, namespaces, DTDs etc.

But again, in general, I wouldn’t use XmlReader in the majority of cases.




USING XPATHDOCUMENT/XPATHNAVIGATOR

So another way of getting individual XML Nodes, but being able to “search” a
document is using the XPathNavigator object.

First, the code :

using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

using (var fileStream = File.Open("test.xml", FileMode.Open))
{
    //Load the file and create a navigator object. 
    XPathDocument xPath = new XPathDocument(fileStream);
    var navigator = xPath.CreateNavigator();

    //Compile the query with a namespace prefix. 
    XPathExpression query = navigator.Compile("ns:MyDocument/ns:MyProperty");

    //Do some BS to get the default namespace to actually be called ns. 
    var nameSpace = new XmlNamespaceManager(navigator.NameTable);
    nameSpace.AddNamespace("ns", "http://www.dotnetcoretutorials.com/namespace");
    query.SetContext(nameSpace);

    Console.WriteLine("My Property Value : " + navigator.SelectSingleNode(query).Value);
}

Now honestly… This is bad and I made it bad for a reason. Namespaces here are
really painful. In my particular case because I have a default namespace, this
was the only way I could find out there that would get the XPath working.
Without the namespace, things would actually be a cinch. So with that said I’m
going to admit something here… I have totally used string replace functions to
remove namespaces before… Now I know someone will jump in the comments and say
“but the XML spec says blah blah blah”. I honestly think every headache I’ve
ever had with working with XML has been because of namespaces.



So let me put a caveat on my recommendation here. If the document you are
working with does not make use of namespaces (or you are willing to remove
them), and you need to use an XPath expression to get a single node, then using
the XPathNavigator actually isn’t a bad option. But that’s a big if.
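
To show how much simpler the no-namespace case is, here is a rough sketch. It
assumes a copy of the sample document with the xmlns declaration removed, and
SelectValue is a hypothetical helper name of my own rather than anything built
in.

```csharp
#nullable enable
using System;
using System.IO;
using System.Xml.XPath;

// Hypothetical helper (not from the article): run an XPath query against an
// XML string and return the value of the first match, or null if none.
static string? SelectValue(string xml, string xpath)
{
    var document = new XPathDocument(new StringReader(xml));
    XPathNavigator navigator = document.CreateNavigator();

    // No XmlNamespaceManager needed because the document declares no namespace.
    return navigator.SelectSingleNode(xpath)?.Value;
}

// Assumption: the sample document with its default namespace stripped out.
const string plainXml = @"<MyDocument>
  <MyProperty>Abc</MyProperty>
  <MyAttributeProperty value=""123"" />
</MyDocument>";

Console.WriteLine(SelectValue(plainXml, "/MyDocument/MyProperty"));                 // prints "Abc"
Console.WriteLine(SelectValue(plainXml, "/MyDocument/MyAttributeProperty/@value")); // prints "123"
```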


USING XMLDOCUMENT

XmlDocument can be thought of like an upgraded version of the XPathNavigator. It
has a few easier methods to load documents, and allows you to modify documents
in memory too!

using System;
using System.Xml;

XmlDocument document = new XmlDocument();
document.Load("test.xml");

// Map the document's default namespace to the prefix "ns" for use in XPath.
XmlNamespaceManager m = new XmlNamespaceManager(document.NameTable);
m.AddNamespace("ns", "http://www.dotnetcoretutorials.com/namespace");
Console.WriteLine(document.SelectSingleNode("ns:MyDocument/ns:MyProperty", m).InnerText);

Overall you still have to deal with some namespace funny business (e.g. default
namespaces are not handled well), and you still have to get each element one by
one as you need it, but I do think this is the best option if you are looking to
pull out only a small subset of the XML doc. The fact you can modify the XML and
save it back to file is also a pretty good selling point.
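
As a sketch of that modify-and-save workflow (SetProperty is a hypothetical
helper name of my own, and the XML is inlined only so the example runs on its
own), something like this could work:

```csharp
#nullable enable
using System;
using System.Xml;

// Hypothetical helper (not from the article): change the text of MyProperty
// and return the updated document as a string.
static string SetProperty(string xml, string newValue)
{
    var document = new XmlDocument();
    document.LoadXml(xml);

    var namespaces = new XmlNamespaceManager(document.NameTable);
    namespaces.AddNamespace("ns", "http://www.dotnetcoretutorials.com/namespace");

    XmlNode? node = document.SelectSingleNode("/ns:MyDocument/ns:MyProperty", namespaces);
    if (node is null)
    {
        throw new InvalidOperationException("MyProperty element not found.");
    }

    // The change happens in memory only.
    node.InnerText = newValue;

    // Calling document.Save("test.xml") here would write it back to disk instead.
    return document.OuterXml;
}

// Assumption: a trimmed copy of the sample document, inlined for the example.
const string sampleXml = @"<MyDocument xmlns=""http://www.dotnetcoretutorials.com/namespace"">
  <MyProperty>Abc</MyProperty>
</MyDocument>";

Console.WriteLine(SetProperty(sampleXml, "Xyz"));
```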




USING XMLSERIALIZER

Now we are cooking with gas. XmlSerializer, in my opinion, is the very best way
to parse XML in .NET Core. If you’ve used JsonConvert from JSON.NET before, then
this is very close to being the same sort of setup.

First we simply create a class that models our actual XML file. We use a bunch
of attributes to specify how to read the doc, which namespace we are using, even
what type of node we are trying to deserialize (e.g. an attribute, element or
array).

using System.Collections.Generic;
using System.Xml.Serialization;

[XmlRoot("MyDocument", Namespace = "http://www.dotnetcoretutorials.com/namespace")]
public class MyDocument
{
    public string MyProperty { get; set; }

    public MyAttributeProperty MyAttributeProperty { get; set; }

    // Maps <MyList><MyListItem>...</MyListItem></MyList> to a list.
    [XmlArray]
    [XmlArrayItem(ElementName = "MyListItem")]
    public List<string> MyList { get; set; }
}

public class MyAttributeProperty
{
    [XmlAttribute("value")]
    public int Value { get; set; }
}

Really really simple. And then the code to actually read our XML and turn it
into this class :



using (var fileStream = File.Open("test.xml", FileMode.Open))
{
    XmlSerializer serializer = new XmlSerializer(typeof(MyDocument));
    var myDocument = (MyDocument)serializer.Deserialize(fileStream);

    Console.WriteLine($"My Property : {myDocument.MyProperty}");
    Console.WriteLine($"My Attribute : {myDocument.MyAttributeProperty.Value}");

    foreach(var item in myDocument.MyList)
    {
        Console.WriteLine(item);
    }
}

No messing about trying to get namespaces right, no trying to work out the
correct XPath, it just works. I think once you start using XmlSerializer, you
will wonder why you ever bothered trying to manually read out XML documents.

Now there is a big caveat. If you don’t really care about the bulk of the
document and you are just trying to get a really deep element, it can be painful
creating these huge models and classes just to get a single element.

Overall, in 99.9% of cases, try and use XmlSerializer to parse XML. It’s less
brittle than the other options and follows a very similar “pattern” to that of
JSON serialization, meaning anyone who has worked with one can work with the
other.




7 THOUGHTS ON “HOW TO PARSE XML IN C# .NET”

 1. Burstx
    April 30, 2020 at 9:12 pm
    
    What about LinqToXml and XDocument.Load() ?
    
    * Wade
      May 1, 2020 at 9:00 am
      
      True. I haven’t used XDocument that much. The Linq is nice but you’re
      still plucking the elements similar to XmlDocument. I also found the Linq
      wasn’t as straightforward (e.g. you still have to use XML specific things
      like .Descendants() etc). It’s an option for sure, but I think if you want
      to pluck a single element, using XPath (with XmlDocument or XDocument)
      would be the way to go, and if you wanted the entire document
      XmlSerializer would be the way to go.
      
    
 2. Rick Currey
    October 25, 2020 at 11:52 pm
    
    Why not use the edit->paste special->paste xml as class option to have VS
    build your classes for you?
    
    * AndrzejM
      November 23, 2020 at 12:27 am
      
      Paste XML as classes is quite a nice idea, but it’s done with a “good
      enough” approach, which means that some more complicated real-world XMLs
      end with:
      
      —————————
      Microsoft Visual Studio
      —————————
      Paste XML As Classes The operation failed due to
      System.NullReferenceException: Object reference not set to an instance of
      an object.
      —————————
      OK
      —————————
      
    
 3. Luis Miranda
    February 14, 2021 at 4:15 am
    
    In your opinion, is XmlSerializer good enough when dealing with files larger
    than 1 GB? Or would you recommend, as you said in the post, the use of
    XmlReader?
    
    * Wade
      February 14, 2021 at 7:45 am
      
      For a 1GB file, streaming the file is the correct option (so XmlReader).
      You could try XmlSerializer though, I’ve definitely opened some rather
      large files using XmlSerializer without issue.
      
    
 4. Nick P
    July 2, 2021 at 2:55 am
    
    Just come across this Wade – absolutely spot on and chimes exactly with my
    experience, especially with the dreaded Namespaces! Great article
    


2022 © DotNetCoreTutorials All rights reserved.

