As you may know, Python has a really nice XML library called ElementTree, but it has a few quirks:
import codecs
import sys

import philologic.shlaxtree as st

# Parse each file named on the command line and grab its TEI header.
for filename in sys.argv[1:]:
    file = codecs.open(filename, "r", "utf-8")
    root = st.parse(file)
    header = root.find("teiHeader")
Not bad for 10 lines, no? What's really cool is that you can modify trees, nodes, and fragments before writing them out, with neat recursive functions and whatnot. I've been using it to convert old SGML dictionaries to TEI. Once you get the hang of it, it's much easier than regular expressions, and much easier to maintain and modify as well.
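To give a flavor of the recursive approach, here is a minimal sketch that renames tags throughout a fragment before writing it out. It uses the standard library's xml.etree.ElementTree rather than shlaxtree, on the assumption that both expose the same element interface. The tag mapping, the retag function, and the sample entry are all made up for illustration; they are not taken from any real dictionary conversion.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from legacy SGML tag names to TEI element names.
TAG_MAP = {"ENTRY": "entry", "HW": "orth", "DEF": "def"}

def retag(node):
    """Recursively rename tags in place, leaving unmapped tags alone."""
    node.tag = TAG_MAP.get(node.tag, node.tag)
    for child in node:
        retag(child)
    return node

# A made-up dictionary entry standing in for real SGML input.
fragment = ET.fromstring("<ENTRY><HW>abeille</HW><DEF>insect</DEF></ENTRY>")
retag(fragment)

# Serialize the modified fragment back to a string.
result = ET.tostring(fragment, encoding="unicode")
print(result)
```

Changing the conversion then mostly means editing the mapping, which is a large part of why this is easier to maintain than a pile of regular expressions.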