Parsing XML files with Python (xml.etree.ElementTree)

fcento


Published 9 Nov 2024

COMMENTS • 45

@bayrakmusti1 1 year ago ⁺³
That's how it is supposed to be taught. I have been browsing the courses on how to do it and they all are complicated. Thankfully found this video. Thanks a lot. Great job!
@jdvelasquezr 5 months ago
Thank you, Francesco, for taking the time to review this library's different functions. You have greatly helped me finish a much-needed script for our localization engineering tasks. Notably, adding text to an existing tag saved the day.
@ImtiazEbnaMannan 1 year ago ⁺²
Thanks a lot for the great tutorial. Your approach to XML parsing was spot-on for me and it was exactly what I was looking for to get started on XML parsing.
@UsmanSaadat 2 years ago ⁺⁵
Thanks a lot for this video. I couldn't grasp the concepts properly even after reading books. This video made it look like a piece of cake.
@konradp6379 21 days ago
Hi everyone. I tried it on my own, the same way you did it. After that I found your video. Thanks a lot for making it.
@RodrigoMontes 1 year ago ⁺¹
Excellent man!
This is what I was looking for :)
@ginopeduto4264 3 years ago ⁺⁵
Thank you so much!!! That was exactly what I was looking for, and all well explained!!!
@A_A7337 2 years ago ⁺²
Great video. Thanks
@debasishsahoo1268 4 months ago
Awesome
@stanleymbah8983 2 years ago ⁺¹
thank you for this
@arshap9351 3 years ago ⁺⁴
Increase your font size before doing tutorials. It's quite hard to read the text. Anyway, good job.
@attilioturco 9 months ago
nice vid thanks
@arnolda7417 3 years ago ⁺²
Hi Francesco. Thanks for the great video! I ran into an error after editing my XML file. I tried to view the entire file with ET.dump(tree) to make sure my changes were made, and I always get "AttributeError: 'str' object has no attribute 'items'". I'm testing in a Jupyter notebook, and when I restart the kernel, ET.dump works just fine before I make changes to the file. Any idea how to fix this? I'm new to Python.
@fcento 3 years ago ⁺¹
Hi Arnold, can you share the code?
@arnolda7417 3 years ago
@@fcento Absolutely. Is there an email address I can send it to? I'd like to include the payload as well for reference.
@sidjjj 2 years ago
Thanks for this video. I needed to parse XML from a variable instead of a file and found this: xml_data_tree = ET.fromstring(received_packet)
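For anyone landing on this comment, here is a minimal sketch of parsing XML held in a variable rather than a file. Only `ET.fromstring` comes from the comment; the contents of `received_packet` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload; in practice this might come from a socket or an API response
received_packet = "<packet><id>7</id><status>ok</status></packet>"

# fromstring() returns the root Element directly (not an ElementTree object)
xml_data_tree = ET.fromstring(received_packet)

for child in xml_data_tree:
    print(child.tag, child.text)
```

Because `fromstring` hands back the root element itself, there is no `getroot()` step as there is after `ET.parse`.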
@hoscoharding7319 4 years ago ⁺¹
Hi Francesco! I have been trying to do something with ElementTree for several days, but I can't get it to work... and I have the feeling that it is very simple. I want to make a little script that adds a child element only if the parent doesn't already have it. Imagine that the document lacks a year for Panama. My script would go through the XML document and add the year only to Panama... Could you give me some ideas, please?
Many thanks.
@fcento 4 years ago ⁺³
I've just made a video about it: ua-cam.com/video/5BrVPpOifto/v-deo.html
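A rough sketch of the "add it only if it is missing" idea from the question. This is not taken from the linked video; the sample XML is made up in the style of the country data, and the year value is a placeholder.

```python
import xml.etree.ElementTree as ET

# Made-up sample: Panama has no <year> child, Liechtenstein does
xml_text = """
<data>
    <country name="Liechtenstein"><rank>1</rank><year>2008</year></country>
    <country name="Panama"><rank>68</rank></country>
</data>
"""
root = ET.fromstring(xml_text)

for country in root.findall("country"):
    # find() returns None when there is no matching direct child
    if country.find("year") is None:
        year = ET.SubElement(country, "year")
        year.text = "2011"  # placeholder value for illustration

ET.dump(root)
```

Comparing with `is None` matters here: an element that exists but has no children can be falsy, so a plain `if not country.find("year")` could add duplicates.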
@giacomocillari4448 2 years ago
Is there a way to change a sub-element instead of the whole element string? Let's say, for example, that I want to change W to SW but not the name, and I need to do it in a loop, so I can't hard-code the name string because it changes every time. Is there a way to address the specific sub-element?
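One possible reading of this question, sketched under the assumption that the W/SW value lives in an attribute such as `direction` on a `<neighbor>` element (the sample below is made up). Only that attribute is changed; `name` is never touched.

```python
import xml.etree.ElementTree as ET

# Made-up sample; the element and attribute names are assumptions
xml_text = """
<data>
    <country name="Panama">
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
"""
root = ET.fromstring(xml_text)

# iter() visits every <neighbor> at any depth, so no name needs to be hard-coded
for neighbor in root.iter("neighbor"):
    if neighbor.get("direction") == "W":
        neighbor.set("direction", "SW")  # the 'name' attribute is left as it was

updated = [(n.get("name"), n.get("direction")) for n in root.iter("neighbor")]
print(updated)
```

If the value is element text rather than an attribute, the same loop applies with `elem.text` in place of `get`/`set`.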
@xst-k6 10 months ago
Can you show us how to parse a Tableau dashboard file (*.twb)? It's an XML file, Tableau just renamed it. I am trying to create a data dictionary from the .twb file.
@myyoutubeaccount0123_ 2 years ago
thanks a lot
@fcento 2 years ago
Happy to help
@konradp6379 21 days ago
Is there some built-in way to avoid having to format this XML?
@KiviliG 3 years ago
Can this be done with the BeautifulSoup library?
@vijayalakshmi8282 2 years ago
Hi Francesco, great video, thanks. I need a small suggestion. Let's say I have <KTOPL>100</KTOPL>; for this I need output like "KTOPL 100". I need both the tag and the value. How can we get that? Can you please explain?
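A small sketch of getting both the tag name and its value, assuming the element looks like `<KTOPL>100</KTOPL>`. The `<record>` wrapper and the second element are hypothetical, added only so there is something to loop over.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper and second field; only KTOPL/100 comes from the question
xml_text = "<record><KTOPL>100</KTOPL><BUKRS>0001</BUKRS></record>"
root = ET.fromstring(xml_text)

# Every Element exposes its name as .tag and its content as .text
pairs = [(elem.tag, elem.text) for elem in root]

for tag, value in pairs:
    print(tag, value)
```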
@shrinivasulunandyala9269 3 years ago
Merging XML files using Python: can you please make a video on this topic?
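One simple way to think about merging, sketched under the assumption that both documents share the same root layout and that "merge" just means collecting all children under one root. The two inputs are made-up strings; with real files, `ET.parse(...).getroot()` would take the place of `fromstring`.

```python
import xml.etree.ElementTree as ET

# Made-up inputs standing in for two files with the same structure
first = ET.fromstring('<data><country name="Panama"/></data>')
second = ET.fromstring(
    '<data><country name="Singapore"/><country name="Liechtenstein"/></data>'
)

# list(second) gives the direct children; extend() appends them to the first root
first.extend(list(second))

merged_names = [c.get("name") for c in first.findall("country")]
print(merged_names)
```

This does no de-duplication; if the same entry can appear in both files, a check on some key attribute would be needed before appending.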
@markdillon9588 2 years ago
Can you mass edit multiple files?
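A sketch of one way to batch-edit, assuming every file sits in one folder and needs the same change. The function name `bump_ranks`, the `<rank>` edit, and the throwaway demo files are all invented for illustration.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def bump_ranks(folder):
    """Add 1 to every <rank> in each .xml file in folder; return the file names."""
    changed = []
    for path in sorted(Path(folder).glob("*.xml")):
        tree = ET.parse(str(path))
        for rank in tree.getroot().iter("rank"):
            rank.text = str(int(rank.text) + 1)
        # Overwrites the file in place; keep backups when working on real data
        tree.write(str(path), encoding="utf-8", xml_declaration=True)
        changed.append(path.name)
    return changed


# Demo on temporary files so nothing real is modified
with tempfile.TemporaryDirectory() as tmp:
    for name, rank in [("a.xml", 1), ("b.xml", 4)]:
        Path(tmp, name).write_text(
            f"<data><country><rank>{rank}</rank></country></data>", encoding="utf-8"
        )
    changed = bump_ranks(tmp)
    results = [
        ET.parse(str(Path(tmp, n))).getroot().find("country/rank").text
        for n in changed
    ]
    print(changed, results)
```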
@CinemagicMindset 2 years ago
Hi Francesco,
I'm getting an error while parsing an XML file because it contains special words. Kindly help me avoid this error.
Error: xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 277, column 366
@fcento 2 years ago
If you are sure the file you have is a valid xml (there are online tools to help you there), then what comes to mind is incorrect encoding. Check the documentation here: docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.XMLParser
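A hedged sketch of two things that can help when chasing this kind of error: catching `ParseError` to see where the parser stopped, and passing an explicit encoding through `XMLParser`. Both samples are made up, and whether encoding is the actual cause for the file in question is not known.

```python
import xml.etree.ElementTree as ET

# Made-up sample: a bare '&' is not well-formed XML (it should be written as &amp;)
bad_xml = "<items>\n  <item>Fish & Chips</item>\n</items>"

error_position = None
try:
    ET.fromstring(bad_xml)
except ET.ParseError as err:
    # position is a (line, column) tuple pointing at the offending token
    error_position = err.position
    print("Parse failed:", err)

# If the bytes are in a known encoding that the file does not declare,
# XMLParser accepts an explicit override.
latin1_bytes = "<items><item>caf\u00e9</item></items>".encode("iso-8859-1")
parser = ET.XMLParser(encoding="iso-8859-1")
root = ET.fromstring(latin1_bytes, parser=parser)
print(root[0].text)
```

The line and column in the error message are usually the quickest lead: opening the file at that spot often shows an unescaped character or a byte that does not match the declared encoding.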
@fantasticprajwal7442 4 years ago
How do I install xmltree in Python 2.7.5?
I am not able to upgrade due to restrictions.
@fcento 4 years ago
docs.python.org/2.7/library/xml.etree.elementtree.html It's a built-in library, so there is no need to install it. I recommend you start figuring out the restrictions, because 2.7 is deprecated. 3.8 is now also available with Anaconda. Some of the code I used may not work on 2.7; bear that in mind.
@padraigmaccu9333 3 years ago ⁺¹
[Translated from Irish] Thank you very much, Francesco. This is something I have needed for a long time. Pádraig Mac Con Uladh
@codelearnexe475 2 years ago ⁺¹
Was not expecting Irish in this chat XD
@KrishnaManohar8021 3 years ago
Looking forward...
@Gamer-mg6my 2 years ago
Hi, I'm trying to get the text of every tag named <Text>, but every <Text> tag also has a <cp> element inside it. Any idea how to extract the content of those tags?
[XML sample from the original comment: the tags were lost in extraction, leaving only bare values. The text values to extract are Entity, EntityTwo and EntityThree.]
@fcento 2 years ago ⁺¹
Let's take it in steps. I'm assuming you want to extract 'Entity', 'EntityTwo', 'EntityThree' from the <Text> element (let me know if I misunderstood your question). The way it's formatted, it contains 2 child elements as well as the piece of text you want to extract. If you just use findall() and read 'text' you get None back; what you want to use in this case is 'tail' instead. I've included sample code here: gist.github.com/fcento100/74b8691af014a8126f8e9ca2ff03c6ea
@fcento 2 years ago ⁺¹
I've put the XML code from your comment in a file here for convenience: gist.github.com/fcento100/19cb7ae6b857c539a2c2843519239efc
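To illustrate the difference between 'text' and 'tail' mentioned above, here is a minimal sketch. It is not the code from the gists; the XML is a simplified, made-up stand-in without namespaces.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in: the wanted text comes AFTER the empty <cp/> child
root = ET.fromstring("<Shape><Text><cp IX='0'/>Entity</Text></Shape>")

text_elem = root.find("Text")
cp = text_elem.find("cp")

# .text only covers what sits between <Text> and its first child: nothing here
print(repr(text_elem.text))

# .tail is the text that follows an element's closing tag, inside its parent
print(repr(cp.tail))
```

So the string belongs, from ElementTree's point of view, to the child that precedes it rather than to the parent that contains it.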
@Gamer-mg6my 2 years ago
@@fcento Yes, you understood me correctly. Ohhh, with tail. Well, I checked it, but with another XML file it didn't work :( so instead I used findall('.//cp', ns) and printed elm.tail, and with that we got the text. I like your solution more, but with the other XML it didn't work :(((((. This is the error that I got:
elmtail = elm.tail.strip()
AttributeError: 'NoneType' object has no attribute 'strip'
@fcento 2 years ago ⁺¹
Apologies for not catching the 'NoneType' error; 'tail' returns None if it doesn't find anything, rather than an empty string. It's fixed now in this version, with an if statement to catch it: gist.github.com/fcento100/11847ad0d8d42eec6c1dc42de897b842 The reason I wasn't getting this error is that I copy-pasted from your message, and since it was formatted, 'tail' returned '\n' and '\t' (the string representations of newline and tab) where it should have returned None, which is why I was able to run the strip command everywhere without error.
In the new code I posted I've shown 2 methods of getting at that piece of data. In your sample XML, "Entity" etc. is the tail of <cp>; root.findall('.//visio:Text/',ns) and root.findall('.//visio:cp',ns) do similar things. The only difference is that using './/visio:Text/' in method 1 will also extract the tail of the other child element, if one is available, which may be undesirable! In that case './/visio:cp', like you suggested, is the way to go.
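A sketch of the None guard described above, combined with a namespaced findall. It is not the gist's code; the namespace URI is a placeholder for illustration and the XML is a made-up sample in which one <cp> has nothing after it.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, not necessarily the one used in the real file
ns = {"visio": "http://example.com/visio"}

xml_text = """
<Shapes xmlns="http://example.com/visio">
    <Shape><Text><cp IX="0"/>Entity</Text></Shape>
    <Shape><Text><cp IX="0"/></Text></Shape>
</Shapes>
"""
root = ET.fromstring(xml_text)

labels = []
for elm in root.findall(".//visio:cp", ns):
    # tail is None when nothing follows the element, so check before strip()
    if elm.tail is not None and elm.tail.strip():
        labels.append(elm.tail.strip())

print(labels)
```

The extra `elm.tail.strip()` test also skips tails that contain only whitespace, which is what pretty-printed files tend to produce.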
@Gamer-mg6my 2 years ago
@@fcento Thanks a lot for your kind help, Francesco :))
