Step 1. Add the JitPack repository to your build file
Add it in your root settings.gradle at the end of repositories:
dependencyResolutionManagement {
repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)
repositories {
mavenCentral()
maven { url 'https://jitpack.io' }
}
}
Add it in your settings.gradle.kts at the end of repositories:
dependencyResolutionManagement {
repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)
repositories {
mavenCentral()
maven { url = uri("https://jitpack.io") }
}
}
Add to pom.xml
<repositories>
<repository>
<id>jitpack.io</id>
<url>https://jitpack.io</url>
</repository>
</repositories>
Add it in your build.sbt at the end of resolvers:
resolvers += "jitpack" at "https://jitpack.io"
Add it in your project.clj at the end of repositories:
:repositories [["jitpack" "https://jitpack.io"]]
Step 2. Add the dependency
dependencies {
implementation 'com.github.xerial:snappy-java:1.1.10.7'
}
dependencies {
implementation("com.github.xerial:snappy-java:1.1.10.7")
}
<dependency>
<groupId>com.github.xerial</groupId>
<artifactId>snappy-java</artifactId>
<version>1.1.10.7</version>
</dependency>
libraryDependencies += "com.github.xerial" % "snappy-java" % "1.1.10.7"
:dependencies [[com.github.xerial/snappy-java "1.1.10.7"]]
snappy-java is a Java port of snappy, a fast C++ compressor/decompressor developed by Google.
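* Compression/decompression of Java primitive arrays (float[], double[], int[], short[], long[], etc.)
* Support for bit-shuffling (BitShuffle) before compression
* The native library to load is chosen according to the system properties os.name and os.arch.
* Compression/decompression methods are provided in org.xerial.snappy.Snappy.

Snappy's main target is very high-speed compression/decompression with reasonable compression size. So the compression ratio of snappy-java is modest and about the same as LZF (ranging 20%-100% according to the dataset).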
Here are some benchmark results, comparing snappy-java and the other compressors LZO-java/LZF/QuickLZ/Gzip/Bzip2. Thanks to Tatu Saloranta @cotowncoder for providing the benchmark suite.
The current stable version is available from here:
Snappy-java is available from Maven's central repository. Add the following dependency to your pom.xml:
<dependency>
<groupId>org.xerial.snappy</groupId>
<artifactId>snappy-java</artifactId>
<version>(version)</version>
<type>jar</type>
<scope>compile</scope>
</dependency>
libraryDependencies += "org.xerial.snappy" % "snappy-java" % "(version)"
First, import org.xerial.snappy.Snappy in your Java code:
import org.xerial.snappy.Snappy;
Then use Snappy.compress(byte[]) and Snappy.uncompress(byte[]):
String input = "Hello snappy-java! Snappy-java is a JNI-based wrapper of "
+ "Snappy, a fast compresser/decompresser.";
byte[] compressed = Snappy.compress(input.getBytes("UTF-8"));
byte[] uncompressed = Snappy.uncompress(compressed);
String result = new String(uncompressed, "UTF-8");
System.out.println(result);
In addition, high-level methods (Snappy.compress(String), Snappy.compress(float[] ..), etc.) and low-level ones (e.g. Snappy.rawCompress(..), Snappy.rawUncompress(..), etc.), which minimize memory copies, can be used.
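As a minimal sketch of the high-level methods mentioned above, the following round-trips a float[] and a String. The class name PrimitiveArrayRoundTrip and its helper methods are made up for illustration; the example assumes snappy-java is on the classpath.

```java
import java.io.IOException;
import java.util.Arrays;
import org.xerial.snappy.Snappy;

// Hypothetical helper class, for illustration only.
public class PrimitiveArrayRoundTrip {

    // Compress a float[] directly and restore it, without converting to bytes by hand.
    static float[] roundTripFloats(float[] values) throws IOException {
        byte[] compressed = Snappy.compress(values);
        return Snappy.uncompressFloatArray(compressed);
    }

    // Compress a String and restore it.
    static String roundTripString(String text) throws IOException {
        byte[] compressed = Snappy.compress(text);
        return Snappy.uncompressString(compressed);
    }

    public static void main(String[] args) throws IOException {
        float[] samples = {1.0f, 2.5f, -3.25f, 2.5f, 2.5f};
        System.out.println(Arrays.toString(roundTripFloats(samples)));
        System.out.println(roundTripString("Hello snappy-java!"));
    }
}
```

Each primitive array type has its own uncompress method (uncompressFloatArray here), so the caller has to know which element type was compressed.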
Stream-based compressor/decompressor SnappyOutputStream/SnappyInputStream are also available for reading/writing large data sets. SnappyFramedOutputStream/SnappyFramedInputStream can be used for the framing format.
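A possible way to use the stream classes is sketched below. It writes through SnappyOutputStream into an in-memory buffer and reads the result back with SnappyInputStream; with real data the buffers would typically be file streams. The class StreamRoundTrip and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import org.xerial.snappy.SnappyInputStream;
import org.xerial.snappy.SnappyOutputStream;

// Hypothetical helper class, for illustration only.
public class StreamRoundTrip {

    // Compress data by writing it through a SnappyOutputStream.
    static byte[] compress(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new SnappyOutputStream(sink)) {
            out.write(data);
        } // closing the stream flushes the last block
        return sink.toByteArray();
    }

    // Decompress by reading from a SnappyInputStream until end of stream.
    static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (InputStream in = new SnappyInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                result.write(buffer, 0, n);
            }
        }
        return result.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "stream me, stream me, stream me".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

The same pattern should apply to SnappyFramedOutputStream/SnappyFramedInputStream, as long as the writer and the reader use the same format.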
The original Snappy format definition did not define a file format. It later added
a "framing" format to define a file format, but by this point major software was
already using an industry standard instead -- represented in this library by the
SnappyOutputStream and SnappyInputStream classes.
For interoperability with other libraries, check that compatible formats are used. Note that not all libraries support all variants.
* SnappyOutputStream and SnappyInputStream use [magic header:16 bytes]([block size:int32][compressed data:byte array])* format. You can read the result of Snappy.compress with SnappyInputStream, but you cannot read the compressed data generated by SnappyOutputStream with Snappy.uncompress.
* SnappyHadoopCompatibleOutputStream does not emit a file header but writes out the current block size as a preamble to each block.

| Write\Read | Snappy.uncompress | SnappyInputStream | SnappyFramedInputStream | org.apache.hadoop.io.compress.SnappyCodec |
| --------------- |:-------------------:|:------------------:|:-----------------------:|:-------------------------------------------:|
| Snappy.compress | ok | ok | x | x |
| SnappyOutputStream | x | ok | x | x |
| SnappyFramedOutputStream | x | x | ok | x |
| SnappyHadoopCompatibleOutputStream | x | x | x | ok |
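To make the [magic header:16 bytes]([block size:int32][compressed data:byte array])* layout concrete, here is a rough sketch that walks such a stream and lists the block sizes. It uses only the JDK and does not decompress anything. The class SnappyStreamLayout is hypothetical, the header is skipped without validation, and reading the block size as a big-endian int32 is an assumption not stated above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
public class SnappyStreamLayout {
    static final int HEADER_SIZE = 16; // magic header length given in the format description

    // Returns the size of each compressed block that follows the header.
    static List<Integer> blockSizes(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        in.readFully(new byte[HEADER_SIZE]); // skip the header (not validated here)
        List<Integer> sizes = new ArrayList<>();
        while (true) {
            int first = in.read();
            if (first < 0) {
                break; // clean end of stream between blocks
            }
            // Assumption: block size is a big-endian int32.
            int size = (first << 24)
                    | (in.readUnsignedByte() << 16)
                    | (in.readUnsignedByte() << 8)
                    | in.readUnsignedByte();
            if (size < 0) {
                throw new IOException("negative block size: " + size);
            }
            in.readFully(new byte[size]); // skip the compressed payload
            sizes.add(size);
        }
        return sizes;
    }

    // Builds a synthetic stream with the same layout: zeroed header plus dummy blocks.
    static byte[] syntheticStream(int... sizes) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.write(new byte[HEADER_SIZE]);
        for (int size : sizes) {
            out.writeInt(size);        // DataOutputStream writes big-endian
            out.write(new byte[size]); // placeholder payload
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] stream = syntheticStream(3, 5);
        System.out.println(blockSizes(new ByteArrayInputStream(stream)));
    }
}
```

The extra header and per-block size fields are what distinguish this layout from the raw output of Snappy.compress, which is why Snappy.uncompress cannot read it directly.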
BitShuffle is an algorithm that reorders data bits (shuffle) for efficient compression (e.g., a sequence of integers, float values, etc.). To use BitShuffle routines, import org.xerial.snappy.BitShuffle:
import org.xerial.snappy.BitShuffle;
int[] data = new int[] {1, 3, 34, 43, 34};
byte[] shuffledByteArray = BitShuffle.shuffle(data);
byte[] compressed = Snappy.compress(shuffledByteArray);
byte[] uncompressed = Snappy.uncompress(compressed);
int[] result = BitShuffle.unshuffleIntArray(uncompressed);
System.out.println(java.util.Arrays.toString(result)); // print the values, not the array reference
Shuffling and unshuffling of primitive arrays (e.g., short[], long[], float[], double[], etc.) are supported. See Javadoc for the details.
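One way to check whether shuffling helps for a given data set is to compare compressed sizes with and without it. The sketch below does this for a long[] of slowly increasing values. The class ShuffleComparison and the sample data are invented for illustration, and the actual sizes depend on the data, so none are claimed here.

```java
import java.io.IOException;
import java.util.Arrays;
import org.xerial.snappy.BitShuffle;
import org.xerial.snappy.Snappy;

// Hypothetical helper class, for illustration only.
public class ShuffleComparison {

    // Shuffle the bits of the long[] first, then compress the shuffled bytes.
    static byte[] compressShuffled(long[] values) throws IOException {
        return Snappy.compress(BitShuffle.shuffle(values));
    }

    // Reverse of compressShuffled: uncompress, then unshuffle back into a long[].
    static long[] restoreShuffled(byte[] compressed) throws IOException {
        return BitShuffle.unshuffleLongArray(Snappy.uncompress(compressed));
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data: timestamps that increase by a fixed step.
        long[] timestamps = new long[4096];
        for (int i = 0; i < timestamps.length; i++) {
            timestamps[i] = 1_700_000_000_000L + i * 10L;
        }

        byte[] plain = Snappy.compress(timestamps);
        byte[] shuffled = compressShuffled(timestamps);

        System.out.println("without shuffle: " + plain.length + " bytes");
        System.out.println("with shuffle:    " + shuffled.length + " bytes");
        System.out.println("round trip ok:   "
                + Arrays.equals(timestamps, restoreShuffled(shuffled)));
    }
}
```

Data written this way must be read back with the matching unshuffle method for the same element type.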
If you have snappy-java-(VERSION).jar in the current directory, use the -classpath option as follows:
$ javac -classpath ".;snappy-java-(VERSION).jar" Sample.java # in Windows
or
$ javac -classpath ".:snappy-java-(VERSION).jar" Sample.java # in Mac or Linux
Post bug reports or feature requests to the Issue Tracker: https://github.com/xerial/snappy-java/issues
Public discussion forum is here: Xerial Public Discussion Group
snappy-java uses sbt (simple build tool for Scala) as a build tool. Here is a simple usage:
$ ./sbt # enter sbt console
> ~test # run tests upon source code change
> ~testOnly # run tests that match a given name pattern
> publishM2 # publish jar to $HOME/.m2/repository
> package # create jar file
> findbugs # Produce findbugs report in target/findbugs
> jacoco:cover # Report the code coverage of tests to target/jacoco folder
If you need to see detailed debug messages, launch sbt with the -Dloglevel=debug option:
$ ./sbt -Dloglevel=debug
For the details of sbt usage, see my blog post: Building Java Projects with sbt
See the build instruction. Building from the source code is an option when your OS platform and CPU architecture are not supported. To build snappy-java, you need Git, JDK (1.6 or higher), a g++ compiler (mingw on Windows), etc.
$ git clone https://github.com/xerial/snappy-java.git
$ cd snappy-java
$ make
When building on Solaris, use gmake:
$ gmake
The resulting file target/snappy-java-$(version).jar additionally contains the native library built for your platform.
The GitHub Action [https://github.com/xerial/snappy-java/blob/master/.github/workflows/release.yml] publishes a new release to Maven Central (Sonatype) when a new tag vX.Y.Z is pushed.
Simply put the snappy-java jar in the WEB-INF/lib folder of your web application. The usual JNI-library specific problem no longer exists, since snappy-java version 1.0.3 or higher can be loaded by multiple class loaders.
Prepare org-xerial-snappy.properties file (under the root path of your library) in Java's property file format. Here is a list of the available properties:
Snappy-java is developed by Taro L. Saito. Twitter @taroleo