Step 1. Add the JitPack repository to your build file
Add it in your root settings.gradle at the end of repositories:
dependencyResolutionManagement {
    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)
    repositories {
        mavenCentral()
        maven { url 'https://jitpack.io' }
    }
}
Add it in your settings.gradle.kts at the end of repositories:
dependencyResolutionManagement {
    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)
    repositories {
        mavenCentral()
        maven { url = uri("https://jitpack.io") }
    }
}
Add it to your pom.xml:
<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>
Add it in your build.sbt at the end of resolvers:
resolvers += "jitpack" at "https://jitpack.io"
Add it in your project.clj at the end of repositories:
:repositories [["jitpack" "https://jitpack.io"]]
Step 2. Add the dependency
In your build.gradle:
dependencies {
    implementation 'com.github.umjammer:vavi-util-screenscraping:1.0.16'
}
In your build.gradle.kts:
dependencies {
    implementation("com.github.umjammer:vavi-util-screenscraping:1.0.16")
}
In your pom.xml:
<dependency>
    <groupId>com.github.umjammer</groupId>
    <artifactId>vavi-util-screenscraping</artifactId>
    <version>1.0.16</version>
</dependency>
In your build.sbt:
libraryDependencies += "com.github.umjammer" % "vavi-util-screenscraping" % "1.0.16"
In your project.clj:
:dependencies [[com.github.umjammer/vavi-util-screenscraping "1.0.16"]]
🌏 Scrape the world!
This library screen-scrapes data from HTML and injects it into POJOs using annotations.
@WebScraper(url = "http://foo.com/bar.html")
public class Baz {
    @Target(value = "//TABLE//TR/TD[2]/DIV/text()")
    String artist;
    @Target(value = "//TABLE//TR/TD[4]/A/text()")
    String title;
    @Target(value = "//TABLE//TR/TD[4]/A/@href")
    String url;
}
    :
List<Baz> bazs = WebScraper.Util.scrape(Baz.class);
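To make the idea concrete, here is a minimal sketch of annotation-driven XPath injection using only the JDK. It is not the library's implementation: `XPathField`, `Track`, `SAMPLE`, and this `scrape` method are hypothetical stand-ins for `@Target`, `Baz`, the fetched page, and `WebScraper.Util.scrape`. It also assumes the input is well-formed XML and that every row yields exactly one node per field, so values can be paired by position.

```java
import java.io.StringReader;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathInjectionSketch {

    /** Hypothetical stand-in for the library's field annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @java.lang.annotation.Target(ElementType.FIELD)
    public @interface XPathField {
        String value();
    }

    /** Sample POJO mirroring the Baz example; one instance per table row. */
    public static class Track {
        @XPathField("//TABLE//TR/TD[2]/DIV/text()")
        public String artist;
        @XPathField("//TABLE//TR/TD[4]/A/text()")
        public String title;
        @XPathField("//TABLE//TR/TD[4]/A/@href")
        public String url;
    }

    /** Made-up, well-formed input standing in for a fetched page. */
    public static final String SAMPLE =
            "<TABLE>"
            + "<TR><TD>1</TD><TD><DIV>Artist A</DIV></TD><TD>x</TD>"
            + "<TD><A href=\"/a.html\">Song A</A></TD></TR>"
            + "<TR><TD>2</TD><TD><DIV>Artist B</DIV></TD><TD>y</TD>"
            + "<TD><A href=\"/b.html\">Song B</A></TD></TR>"
            + "</TABLE>";

    /** Evaluates each annotated field's XPath and fills the i-th object with the i-th match. */
    public static <T> List<T> scrape(Class<T> type, String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Constructor<T> ctor = type.getDeclaredConstructor();
            ctor.setAccessible(true);
            List<T> results = new ArrayList<>();
            for (Field field : type.getDeclaredFields()) {
                XPathField target = field.getAnnotation(XPathField.class);
                if (target == null) {
                    continue; // fields without the annotation are left untouched
                }
                field.setAccessible(true);
                NodeList nodes = (NodeList) xpath.evaluate(target.value(), doc, XPathConstants.NODESET);
                for (int i = 0; i < nodes.getLength(); i++) {
                    if (i >= results.size()) {
                        results.add(ctor.newInstance()); // grow the list on demand
                    }
                    field.set(results.get(i), nodes.item(i).getTextContent());
                }
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("scrape failed", e);
        }
    }

    public static void main(String[] args) {
        for (Track t : scrape(Track.class, SAMPLE)) {
            System.out.println(t.artist + " / " + t.title + " / " + t.url);
        }
    }
}
```

Pairing values by position keeps the sketch short, but it breaks if a row is missing a cell; evaluating the field expressions relative to each row node would be more robust.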
InputHandler ... apply any processing before parsing
Parser
  XPathParser ... default
  HtmlXPathParser ... for original purpose
  SaxonXPathParser ... for huge xml file
  JsonPathParser ... for json return
Parser#foreach() ... like java collection stream
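The split between a pre-processing step and a parser that hands results to a callback can be sketched as follows. This is only an illustration with the JDK's own XML and XPath classes: `PipelineSketch`, `REPLACE_NBSP`, and this `forEach` signature are hypothetical and do not reflect the library's `InputHandler` or `Parser` interfaces.

```java
import java.io.StringReader;
import java.util.function.Consumer;
import java.util.function.UnaryOperator;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PipelineSketch {

    /**
     * Hypothetical pre-processing step: rewrites the HTML-only entity &nbsp;
     * as a numeric character reference, which a plain XML parser accepts.
     */
    public static final UnaryOperator<String> REPLACE_NBSP =
            s -> s.replace("&nbsp;", "&#160;");

    /** Pre-processes the input, parses it, and passes each match to the sink one by one. */
    public static void forEach(String input,
                               UnaryOperator<String> handler,
                               String expression,
                               Consumer<String> sink) {
        try {
            String prepared = handler.apply(input); // processing before parsing
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(prepared)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                sink.accept(nodes.item(i).getTextContent()); // callback per match
            }
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String page = "<TABLE><TR><TD>a&nbsp;b</TD></TR><TR><TD>c</TD></TR></TABLE>";
        forEach(page, REPLACE_NBSP, "//TD", System.out::println);
    }
}
```

Note that this sketch still builds the whole DOM before invoking the callback; a truly streaming parser for huge files would need a different backend.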
~~argument injection into WebScraper#url~~
@WebScraper(url = "http://foo.com?bar={bar}")
public static class Result {
:
List<Result> data = WebScraper.Util.scrape(Result.class, @UrlParam(bar) args[0]);
@WebScraper#encoding()
@Target
add an exception handler, or a second and third option
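The `{bar}` placeholder in the URL example above amounts to template expansion. A minimal sketch of that step, assuming Java 10 or later, is shown here; `UrlTemplateSketch` and `expand` are hypothetical names, and how the library actually binds arguments to placeholders is not shown in this document.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UrlTemplateSketch {

    /** Matches placeholders such as {bar}. */
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    /** Replaces each {name} in the template with the URL-encoded value from params. */
    public static String expand(String template, Map<String, String> params) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = params.get(name);
            if (value == null) {
                throw new IllegalArgumentException("no value for {" + name + "}");
            }
            String encoded = URLEncoder.encode(value, StandardCharsets.UTF_8);
            // quoteReplacement keeps '$' and '\' in the value from being read as group references
            matcher.appendReplacement(out, Matcher.quoteReplacement(encoded));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("http://foo.com?bar={bar}", Map.of("bar", "hello world")));
    }
}
```

Failing fast on a missing value is one possible design choice; leaving the placeholder in place or substituting an empty string would be alternatives.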