Paper title
Web page classification with Google Image Search results
Paper authors
Paper abstract
In this paper, we introduce a novel method that combines multiple neural network outputs to decide the class of an input. This is the first study to use this method for web page classification. In our model, each element is represented by multiple descriptive images. After the neural network model is trained, each element is classified by computing and combining the results for its descriptive images. We apply this idea to the web page classification problem, using Google Image Search results as the descriptive images. We obtained a classification rate of 94.90% on the WebScreenshots dataset, which contains 20,000 websites in 4 classes. The method is easily applicable to similar problems.
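The abstract does not say how the per-image results are combined into a single decision, so the following is only a minimal sketch under stated assumptions. It assumes a trained network yields one probability vector over the 4 classes for each descriptive image, and it shows two common combination rules: averaging the probabilities and majority voting. The function names and the example numbers are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence


def argmax(values: Sequence[float]) -> int:
    """Return the index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda k: values[k])


def combine_by_mean(image_probs: List[Sequence[float]]) -> int:
    """Assumed rule 1: average the per-image class probabilities, then pick the best class."""
    if not image_probs:
        raise ValueError("need at least one descriptive image")
    n_classes = len(image_probs[0])
    totals = [0.0] * n_classes
    for probs in image_probs:
        if len(probs) != n_classes:
            raise ValueError("all probability vectors must have the same length")
        for k, p in enumerate(probs):
            totals[k] += p
    means = [t / len(image_probs) for t in totals]
    return argmax(means)


def combine_by_vote(image_probs: List[Sequence[float]]) -> int:
    """Assumed rule 2: each descriptive image votes for its top class; the most voted class wins."""
    if not image_probs:
        raise ValueError("need at least one descriptive image")
    votes = Counter(argmax(probs) for probs in image_probs)
    return votes.most_common(1)[0][0]


# Hypothetical network outputs for one web page with three descriptive images.
page_a = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.5, 0.3, 0.1, 0.1],
]
# A case where the two rules disagree: two weak votes versus one confident image.
page_b = [
    [0.40, 0.35, 0.15, 0.10],
    [0.40, 0.35, 0.15, 0.10],
    [0.00, 0.90, 0.05, 0.05],
]

print(combine_by_mean(page_a), combine_by_vote(page_a))
print(combine_by_mean(page_b), combine_by_vote(page_b))
```

The second example illustrates why the choice of rule matters: averaging lets a single highly confident image outweigh several uncertain ones, whereas voting treats every descriptive image equally.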