Getting the content of XML tags:
# cat sample.xml
<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications with XML.</description>
   </book>
   <book id="bk102">
      <author>Ralls, Kim</author>
      <title>Midnight Rain</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-12-16</publish_date>
      <description>A former architect battles corporate zombies,
      an evil sorceress, and her own childhood to become queen
      of the world.</description>
   </book>
   <book id="bk103">
      <author>Corets, Eva</author>
      <title>Maeve Ascendant</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-11-17</publish_date>
      <description>After the collapse of a nanotechnology
      society in England, the young survivors lay the
      foundation for a new society.</description>
   </book>
   <book id="bk104">
      <author>Corets, Eva</author>
      <title>Oberon's Legacy</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2001-03-10</publish_date>
      <description>In post-apocalypse England, the mysterious
      agent known only as Oberon helps to create a new life
      for the inhabitants of London. Sequel to Maeve
      Ascendant.</description>
   </book>
   <book id="bk105">
      <author>Corets, Eva</author>
      <title>The Sundered Grail</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2001-09-10</publish_date>
      <description>The two daughters of Maeve, half-sisters,
      battle one another for control of England. Sequel to
      Oberon's Legacy.</description>
   </book>
</catalog>
You want to extract the text between the <description> and </description> tags.
The first occurrence is on a single line. The rest span multiple lines, and you want the newlines to be preserved. I shall assume that you want the leading whitespace on those lines to be preserved as well.
Here’s the script –
$ perl -lne 'BEGIN{undef $/} while (/<description>(.*?)<\/description>/sg){print $1}' sample.xml
An in-depth look at creating applications with XML.
A former architect battles corporate zombies,
      an evil sorceress, and her own childhood to become queen
      of the world.
After the collapse of a nanotechnology
      society in England, the young survivors lay the
      foundation for a new society.
In post-apocalypse England, the mysterious
      agent known only as Oberon helps to create a new life
      for the inhabitants of London. Sequel to Maeve
      Ascendant.
The two daughters of Maeve, half-sisters,
      battle one another for control of England. Sequel to
      Oberon's Legacy.
If you want the newlines preserved but the whitespace at the beginning of each continuation line removed, then –
$ perl -lne 'BEGIN{undef $/} while (/<description>(.*?)<\/description>/sg){($x = $1) =~ s/\n\s*/\n/g; print $x}' sample.xml
An in-depth look at creating applications with XML.
A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.
After the collapse of a nanotechnology
society in England, the young survivors lay the
foundation for a new society.
In post-apocalypse England, the mysterious
agent known only as Oberon helps to create a new life
for the inhabitants of London. Sequel to Maeve
Ascendant.
The two daughters of Maeve, half-sisters,
battle one another for control of England. Sequel to
Oberon's Legacy.
And if you want neither the newlines nor the leading whitespace, i.e. each chunk between the <description> tags on a single line, then –
$ perl -lne 'BEGIN{undef $/} while (/<description>(.*?)<\/description>/sg){($x = $1) =~ s/\n\s*//g; print $x}' sample.xml
An in-depth look at creating applications with XML.
A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.
After the collapse of a nanotechnology society in England, the young survivors lay the foundation for a new society.
In post-apocalypse England, the mysterious agent known only as Oberon helps to create a new life for the inhabitants of London. Sequel to Maeve Ascendant.
The two daughters of Maeve, half-sisters, battle one another for control of England. Sequel to Oberon's Legacy.