Traditional content-extraction methods represent the web page as a tree data structure in main memory. Unfortunately, both building and searching this tree incur relatively high space and time costs.
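To make the cost concrete, here is a minimal sketch of the tree-based approach, assuming a simplified DOM-like tree built with Python's standard `html.parser`. The names `Node`, `TreeBuilder`, `build_tree`, and `longest_text_node`, and the "longest text wins" heuristic, are hypothetical illustrations and not the method of any particular system. Every element becomes an object held in memory, and finding the content means visiting every node.

```python
from html.parser import HTMLParser

# Elements that never have children or a closing tag (partial list, for illustration).
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class Node:
    """One element of the in-memory tree (hypothetical, simplified)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a tree with one Node per element: memory grows with page size."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root
        self.node_count = 1

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.node_count += 1
        if tag not in VOID_TAGS:
            self.current = node  # descend into the new element

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def build_tree(html):
    """Parse the whole page into a tree before any extraction can start."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder


def longest_text_node(root):
    """Naive extraction heuristic (an assumption): visit every node, keep the longest text."""
    best, stack = root, [root]
    while stack:
        node = stack.pop()
        if len(node.text) > len(best.text):
            best = node
        stack.extend(node.children)
    return best
```

In this sketch both the construction and the search are linear in the number of elements, and the entire tree must stay in memory for the search to run, which is the overhead the sentence above refers to.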