That's how it is supposed to be taught. I have been browsing the courses on how to do it and they all are complicated. Thankfully found this video. Thanks a lot. Great job!
Thank you, Francesco, for taking the time to review this library's different functions. You have greatly helped me finish a much-needed script for our localization engineering tasks. Notably, adding text to an existing tag saved the day.
Thanks a lot for the great tutorial. Your approach to XML parsing was spot-on for me and it was exactly what I was looking for to get started on XML parsing.
Thanks a lot for this video. I couldn't grasp the concepts properly even after reading about them in books. This video made it look like a piece of cake.
Hi everyone. I tried it on my own, the same way you did it, and only afterwards found your video. Thanks a lot for making it.
Excellent man!
This is what I was looking for :)
Thanks a lot!!! That was exactly what I was looking for, and all well explained!!!
Great video. Thanks
Awesome
thank you for this
Increase your font size before doing tutorials; the text is quite hard to read. Anyway, good job.
nice vid thanks
Hi Francesco. Thanks for the great video! I ran into an error after editing my xml file. I tried to view the entire file to make sure my changes were made with ET.dump(tree) and I always get "AttributeError: 'str' object has no attribute 'items'" I'm testing with Jupyter notebook and when I restart the kernel, ET.dump works just fine before I make changes to the file. Any idea on how to fix this? I'm new to Python.
Hi Arnold, can you share the code?
@fcento Absolutely. Is there an email address I can send it to? I'd like to include the payload as well for reference.
Thanks for this video, I needed to parse xml from a variable instead of a file and found this : xml_data_tree = ET.fromstring(received_packet)
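A minimal sketch of that approach, assuming the payload arrives as a string or bytes. The contents of `received_packet` below are made up for illustration. Note that `ET.fromstring` returns the root Element directly rather than an ElementTree object.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload standing in for data held in a variable.
received_packet = b"<packet><id>42</id><status>ok</status></packet>"

# fromstring() parses XML held in memory and returns the root Element.
root = ET.fromstring(received_packet)

packet_id = root.find("id").text      # text of the first <id> child
status = root.findtext("status")      # shortcut for find(...).text
print(root.tag, packet_id, status)
```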
Hi Francesco! I have been trying to do something with ElementTree for several days but I can't manage it, even though I get the feeling it is very simple. I want to write a small script that adds a child element only if the parent doesn't already have it. Imagine that the Panama entry in the document lacks a year. My script would go through the XML document and add the year only to Panama. Could you give me some ideas, please?
Many thanks.
I've just made a video about it: ua-cam.com/video/5BrVPpOifto/v-deo.html
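For readers who want a quick starting point, here is one possible sketch of the idea. It is not taken from the linked video, and the sample data and the year value are assumptions. It relies on `find()` returning None when a child with that tag is missing.

```python
import xml.etree.ElementTree as ET

# Made-up sample data: Panama has no <year> element.
xml_data = """
<data>
    <country name="Liechtenstein"><year>2008</year></country>
    <country name="Panama"><rank>68</rank></country>
</data>
"""

root = ET.fromstring(xml_data)

for country in root.findall("country"):
    # find() returns None when no direct child with that tag exists.
    if country.find("year") is None:
        year = ET.SubElement(country, "year")
        year.text = "2011"  # placeholder value, an assumption

years = {c.get("name"): c.findtext("year") for c in root.findall("country")}
print(years)
```

Countries that already have a year are left untouched, so running the loop twice does not add a second element.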
Is there a way to change a sub-element instead of the whole element string? Let's say, for example, that I want to change W to SW but not the name, and I need to do it in a loop, so I can't hard-code the name string because it changes every time. Is there a way to address the specific sub-element?
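One way this could look, assuming the W value lives in an attribute called `direction` next to a `name` attribute (the question does not show the XML, so the sample below is a guess at its shape). Only the one attribute is rewritten, and the name is never referenced.

```python
import xml.etree.ElementTree as ET

# Assumed structure; the real file may differ.
xml_data = """
<data>
    <country name="Liechtenstein">
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Panama">
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
"""
root = ET.fromstring(xml_data)

# iter() walks every <neighbor> at any depth; only "direction" is touched.
for neighbor in root.iter("neighbor"):
    if neighbor.get("direction") == "W":
        neighbor.set("direction", "SW")

result = [(n.get("name"), n.get("direction")) for n in root.iter("neighbor")]
print(result)
```

If the value is element text rather than an attribute, the same loop would compare and assign `neighbor.text` instead of using `get()` and `set()`.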
Can you show us how to parse a Tableau dashboard file (*.twb)? It's an XML file, Tableau just renamed it. I am trying to create a data dictionary from the .twb file.
thanks a lot
Happy to help
Is there a built-in way to avoid having to format this XML?
Can this be done by Beautifulsoup library?
Hi Francesco, great video, thanks. I need a small suggestion. Let's say I have <KTOPL>100</KTOPL>, and I need output like "KTOPL 100", with both the tag and the value. How can we get that? Can you please explain?
Can you please make a video on merging XML files using Python?
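On the earlier question about printing both the tag and the value, a minimal sketch follows, assuming the element looks like `<KTOPL>100</KTOPL>` inside some parent. The parent `row` and the second field are hypothetical. Every Element exposes its name as `.tag` and its content as `.text`.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; only KTOPL and 100 come from the question.
xml_data = "<row><KTOPL>100</KTOPL><BUKRS>1000</BUKRS></row>"
root = ET.fromstring(xml_data)

# Iterating an Element yields its direct children in document order.
pairs = [(child.tag, child.text) for child in root]
for tag, value in pairs:
    print(tag, value)
```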
can you mass edit multiple files?
Hi Francesco,
I'm getting an error while parsing an XML file because it contains special characters. Kindly help me avoid this error.
Error: xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 277, column 366
If you are sure the file you have is a valid xml (there are online tools to help you there), then what comes to mind is incorrect encoding. Check the documentation here: docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.XMLParser
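A small sketch of the encoding scenario, under the assumption that the file is really Latin-1 but carries no encoding declaration. The bytes below are made up, and the actual file may need a different encoding or may simply be malformed.

```python
import xml.etree.ElementTree as ET

# Made-up bytes: Latin-1 encoded text with no encoding declaration.
raw = "<name>café</name>".encode("iso-8859-1")

try:
    ET.fromstring(raw)  # the parser assumes UTF-8 and rejects byte 0xE9
    default_ok = True
except ET.ParseError as err:
    default_ok = False
    print("default parser failed:", err)

# Override the encoding explicitly; this is a guess about the real file.
parser = ET.XMLParser(encoding="iso-8859-1")
root = ET.fromstring(raw, parser=parser)
print(root.text)
```

For a file on disk, the same parser object can be passed as `ET.parse(path, parser=parser)`.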
How do I install xmltree in Python 2.7.5?
I am not able to upgrade due to restrictions.
docs.python.org/2.7/library/xml.etree.elementtree.html It's a built-in library, so there is no need to install it. I recommend you start working out how to get around the restrictions, because 2.7 is deprecated. 3.8 is now also available with Anaconda. Bear in mind that some of the code I used may not work on 2.7.
[Irish] Thank you very much, Francesco. This is something I have needed for a long time. Pádraig Mac Con Uladh
Was not expecting Irish in this chat XD
Looking forward...
Hi, I'm trying to get the text of every tag named <Text>, but each of those tags has a <cp> element inside it. Any idea how to extract the content of the tags?
[Visio XML sample: the markup was lost when the comment was rendered, leaving only element values. The sample contained three repeated blocks whose text values were Entity, EntityTwo, and EntityThree.]
Let's take it in steps. I'm assuming you want to extract 'Entity', 'EntityTwo', and 'EntityThree' from the Text element (let me know if I misunderstood your question). The way it's formatted, it contains two child elements as well as the piece of text you want to extract. If you just use findall() and read 'text', you get None back; what you want to use in this case is 'tail' instead. I've included sample code here: gist.github.com/fcento100/74b8691af014a8126f8e9ca2ff03c6ea
I've put the XML code from your comment in a file here for convenience: gist.github.com/fcento100/19cb7ae6b857c539a2c2843519239efc
@fcento Yes, you understood me correctly. Ohh, with tail. Well, I tried it, but it failed on another XML file :( Instead, I used findall('.//cp', ns) and printed elm.tail, and that gave us the text. I like your solution better, but it failed on the other XML file. This is the error I got:
elmtail = elm.tail.strip()
AttributeError: 'NoneType' object has no attribute 'strip'
Apologies for not catching the 'NoneType' error. In effect, 'tail' returns None rather than an empty string if it doesn't find anything. It's fixed now in this version, with an if statement to catch it: gist.github.com/fcento100/11847ad0d8d42eec6c1dc42de897b842 The reason I wasn't getting this error is that I copied and pasted from your message, and since it was formatted, 'tail' returned '\n' and '\t' (the string representations of new-line and tab) where it should have returned None, which is why I was able to run the strip command everywhere without error.
In the new code I posted, I've shown two methods of getting at that piece of data. In your sample XML, "Entity" etc. is the tail of the cp element; root.findall('.//visio:Text/',ns) and root.findall('.//visio:cp',ns) do similar things. The only difference is that using './/visio:Text/' in method 1 will also extract the tail of any other child element, if one is present, which may be undesirable! In that case './/visio:cp', like you suggested, is the way to go.
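The None check can be sketched roughly as follows. This is not the code from the gists; the sample XML is made up, and the namespace URI is a placeholder rather than the real Visio one.

```python
import xml.etree.ElementTree as ET

# Made-up sample; the namespace URI is a placeholder, not the real Visio one.
xml_data = """
<Shapes xmlns="http://example.com/visio">
    <Shape><Text><cp IX="0"/>Entity</Text></Shape>
    <Shape><Text><cp IX="0"/>EntityTwo</Text></Shape>
    <Shape><Text><cp IX="0"/></Text></Shape>
</Shapes>
"""
ns = {"visio": "http://example.com/visio"}
root = ET.fromstring(xml_data)

# .text only covers what precedes the first child, so it is None here.
texts = [t.text for t in root.findall(".//visio:Text", ns)]

names = []
for cp in root.findall(".//visio:cp", ns):
    # tail is the text after the element's end; None if nothing follows.
    if cp.tail is not None and cp.tail.strip():
        names.append(cp.tail.strip())

print(texts)
print(names)
```

The third shape has nothing after its cp element, so its tail is None and it is skipped instead of raising the AttributeError.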
@fcento Many thanks for your kind help, Francesco :))