[SOLVED] Parsing XML with BeautifulSoup

jmorillo
Messages: 7
Registration: Oct 16, 2024 - 3:26 p.m.

October 16, 2024 - 3:38 PM

Hello,

I need to create a "Clari Copilot" package (easy: their installer.exe works correctly with /S).
However, I'm stuck on the update_package function (update_package.py) because the binary is hosted on a CDN with no main HTML page, although I was able to find an XML page listing the releases.
In setupdevhelpers.py, there are the bs_find and bs_find_all functions, which call BeautifulSoup (bs4) with features="html.parser" by default.
BeautifulSoup, and therefore the bs_find* functions, accept features="xml", but for that BeautifulSoup needs the "lxml" Python library, which I believe isn't present by default in WAPT's Python virtual environment.
I could write a crude parser as a workaround, but it would be better to use bs_find* and BeautifulSoup natively with XML.
Do you have any suggestions? Is there a plan to integrate the lxml library into a future release? Or perhaps I've missed something?
Thank you very much in advance.
Sincerely,
Jordi
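For readers looking for the kind of workaround mentioned in the question, one option that needs neither lxml nor BeautifulSoup is Python's standard xml.etree.ElementTree module. This is a different technique from bs_find*, shown only as a minimal sketch: the sample document, its element names other than Key, the file names, and the find_all_text helper are all assumptions made for illustration, not the real CDN listing or a WAPT function.

```python
import xml.etree.ElementTree as ET

# Hypothetical release listing; the real document's structure is not shown in the thread.
SAMPLE_XML = """<Listing>
  <Item><Key>releases/app-1.2.3.exe</Key></Item>
  <Item><Key>releases/app-1.3.0.exe</Key></Item>
</Listing>"""


def find_all_text(xml_text, tag):
    """Return the text of every element named `tag` (case-sensitive), in document order."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(tag)]


print(find_all_text(SAMPLE_XML, "Key"))
```

ElementTree is a real XML parser, so element names keep their original case. If the real document declares a default XML namespace, the tag passed to iter() would need to include it.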
blemoigne
Messages: 178
Registration: July 17, 2020 - 11:29

October 16, 2024 - 4:45 PM

Hi Jordi,
You can still parse the XML with the HTML parser (you'll just get a warning). This package does exactly that: https://wapt.tranquil.it/store/fr/tis-0install
jmorillo
Messages: 7
Registration: Oct 16, 2024 - 3:26 p.m.

October 16, 2024 - 6:33 PM

Thank you so much, Bertrand!
Everything is working correctly!
Just one minor issue: the XML element is declared with a capital letter, like this: "<Key>".

I couldn't get any results with

Code:

bs_find_all('https://contoso.com/test.xml', 'Key')

I had to write the tag name in lowercase (Key -> key), as in

Code:

bs_find_all('https://contoso.com/test.xml', 'key')

for a result to be returned.
In any case, I will now be able to finish the update_package function.
Thank you so much again.
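The case mismatch described above is consistent with how Python's standard html.parser module behaves: it reports tag names converted to lowercase, and BeautifulSoup's "html.parser" feature is built on that module. The sketch below reproduces the effect with the standard library only. The sample document, its element names other than Key, the file names, and the TagTextCollector and collect names are assumptions made for illustration, not the real CDN listing or WAPT code.

```python
from html.parser import HTMLParser

# Hypothetical release listing; the real document's structure is not shown in the thread.
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<Listing>
  <Item><Key>releases/app-1.2.3.exe</Key></Item>
  <Item><Key>releases/app-1.3.0.exe</Key></Item>
</Listing>"""


class TagTextCollector(HTMLParser):
    """Collect the text of every element whose reported name equals `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.seen_tags = []   # tag names exactly as the parser reports them
        self.values = []      # text content of each matching element
        self._inside = False

    def handle_starttag(self, tag, attrs):
        # `tag` arrives already lowercased by HTMLParser.
        self.seen_tags.append(tag)
        if tag == self.wanted:
            self._inside = True
            self.values.append("")

    def handle_endtag(self, tag):
        if tag == self.wanted:
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.values[-1] += data


def collect(wanted, document=SAMPLE_XML):
    """Parse `document` and return the collector so its results can be inspected."""
    parser = TagTextCollector(wanted)
    parser.feed(document)
    parser.close()
    return parser


print(collect("Key").values)     # [] because the parser never reports "Key"
print(collect("key").values)     # ['releases/app-1.2.3.exe', 'releases/app-1.3.0.exe']
print(collect("key").seen_tags)  # every name is lowercase, e.g. 'listing', 'item', 'key'
```

Since the search is an exact string comparison against names that have already been lowercased, searching for 'Key' finds nothing while 'key' matches.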
dcardon
WAPT Expert
Messages: 1929
Registration: June 18, 2014 - 09:58
Location: Saint Sébastien sur Loire

October 17, 2024 - 3:47 PM

Hi Jordi,
Thanks for your feedback :-)
I'm marking the topic as resolved.
Denis
Denis Cardon - Tranquil IT