FTP

The FTP connector provides Apache Pekko Stream sources to connect to FTP, FTPs and SFTP servers. Currently, two kinds of sources are provided:

  • one for browsing or traversing the server recursively, and
  • another for retrieving files as a stream of bytes.
Project Info: Apache Pekko Connectors FTP

Artifact: org.apache.pekko : pekko-connectors-ftp : 1.0.2
JDK versions: OpenJDK 8, OpenJDK 11, OpenJDK 17
Scala versions: 2.13.14, 2.12.20, 3.3.3
JPMS module name: pekko.stream.connectors.ftp
Release notes: GitHub releases
Issues: GitHub issues
Sources: https://github.com/apache/pekko-connectors

Artifacts

sbt
val PekkoVersion = "1.0.3"
libraryDependencies ++= Seq(
  "org.apache.pekko" %% "pekko-connectors-ftp" % "1.0.2",
  "org.apache.pekko" %% "pekko-stream" % PekkoVersion
)
Maven
<properties>
  <pekko.version>1.0.3</pekko.version>
  <scala.binary.version>2.13</scala.binary.version>
</properties>
<dependencies>
  <dependency>
    <groupId>org.apache.pekko</groupId>
    <artifactId>pekko-connectors-ftp_${scala.binary.version}</artifactId>
    <version>1.0.2</version>
  </dependency>
  <dependency>
    <groupId>org.apache.pekko</groupId>
    <artifactId>pekko-stream_${scala.binary.version}</artifactId>
    <version>${pekko.version}</version>
  </dependency>
</dependencies>
Gradle
def versions = [
  PekkoVersion: "1.0.3",
  ScalaBinary: "2.13"
]
dependencies {
  implementation "org.apache.pekko:pekko-connectors-ftp_${versions.ScalaBinary}:1.0.2"
  implementation "org.apache.pekko:pekko-stream_${versions.ScalaBinary}:${versions.PekkoVersion}"
}


Configuring the connection settings

In order to establish a connection with the remote server, you need to provide a specialized version of a RemoteFileSettings instance. It’s specialized as it depends on the kind of server you’re connecting to: FTP, FTPs or SFTP.

Scala
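import org.apache.pekko.stream.connectors.ftp.FtpSettings
import org.apache.commons.net.PrintCommandListener
import org.apache.commons.net.ftp.FTPClient
import java.io.PrintWriter
import java.net.InetAddress

val ftpSettings = FtpSettings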
  .create(InetAddress.getByName(HOSTNAME))
  .withPort(PORT)
  .withCredentials(CREDENTIALS)
  .withBinary(true)
  .withPassiveMode(true)
  // only useful for debugging
  .withConfigureConnection((ftpClient: FTPClient) => {
    ftpClient.addProtocolCommandListener(new PrintCommandListener(new PrintWriter(System.out), true))
  })
Java
import org.apache.pekko.stream.connectors.ftp.javadsl.Ftp;
import org.apache.pekko.stream.connectors.ftp.FtpSettings;
import org.apache.pekko.stream.javadsl.Source;
import org.apache.commons.net.PrintCommandListener;
import org.apache.commons.net.ftp.FTPClient;
import java.io.PrintWriter;
import java.net.InetAddress;

FtpSettings ftpSettings =
    FtpSettings.create(InetAddress.getByName(HOSTNAME))
        .withPort(PORT)
        .withCredentials(CREDENTIALS)
        .withBinary(true)
        .withPassiveMode(true)
        // only useful for debugging
        .withConfigureConnectionConsumer(
            (FTPClient ftpClient) -> {
              ftpClient.addProtocolCommandListener(
                  new PrintCommandListener(new PrintWriter(System.out), true));
            });

The configuration above will create an anonymous connection with a remote FTP server in passive mode. For both FTPs and SFTP servers, you will need to provide the specialized versions of these settings: FtpsSettings or SftpSettings respectively.

The example demonstrates the optional configureConnection option, which is available on FTP and FTPs clients. Use it to configure any custom parameters the server may require, such as explicit or implicit data transfer encryption.

For a non-anonymous connection, provide an instance of NonAnonFtpCredentials instead.

To connect via a proxy, provide an instance of java.net.Proxy by using the withProxy method.
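
As a minimal sketch of the proxy case, the snippet below builds a java.net.Proxy using only JDK classes. The class name, the SOCKS proxy type, and the 127.0.0.1:1080 address are placeholder assumptions rather than values from this project; the resulting object is the kind of value you would hand to withProxy.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

public class ProxyExample {

  // Builds a SOCKS proxy descriptor for the given host and port.
  // The proxy type is an assumption; pick the one your infrastructure uses.
  static Proxy socksProxy(String host, int port) {
    return new Proxy(Proxy.Type.SOCKS, new InetSocketAddress(host, port));
  }

  public static void main(String[] args) {
    // Placeholder address; replace it with your proxy host and port.
    Proxy proxy = socksProxy("127.0.0.1", 1080);
    System.out.println(proxy.type() + " via " + proxy.address());
  }
}
```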

To connect using a private key, provide an instance of SftpIdentity to SftpSettings.

To use a custom SSH client for SFTP, provide an instance of SSHClient.

Scala
import org.apache.pekko.stream.connectors.ftp.scaladsl.{ Sftp, SftpApi }
import net.schmizz.sshj.{ DefaultConfig, SSHClient }

val sshClient: SSHClient = new SSHClient(new DefaultConfig)
val configuredClient: SftpApi = Sftp(sshClient)
Java
import org.apache.pekko.stream.connectors.ftp.javadsl.Sftp;
import org.apache.pekko.stream.connectors.ftp.javadsl.SftpApi;
import net.schmizz.sshj.DefaultConfig;
import net.schmizz.sshj.SSHClient;

public class ConfigureCustomSSHClient {

  public ConfigureCustomSSHClient() {
    SSHClient sshClient = new SSHClient(new DefaultConfig());
    SftpApi sftp = Sftp.create(sshClient);
  }
}

Improving SFTP throughput

To allow the client to send more than one unconfirmed read request on an SFTP connection, use withMaxUnconfirmedReads on SftpSettings. The command-line tool sftp uses a value of 64 by default. Allowing several outstanding reads can significantly improve throughput by reducing the impact of latency.

Scala
import org.apache.pekko
import pekko.stream.IOResult
import pekko.stream.connectors.ftp.SftpSettings
import pekko.stream.connectors.ftp.scaladsl.Sftp
import pekko.stream.scaladsl.Source
import pekko.util.ByteString

import scala.concurrent.Future

def retrieveFromPath(path: String, settings: SftpSettings): Source[ByteString, Future[IOResult]] =
  Sftp.fromPath(path, settings.withMaxUnconfirmedReads(64))
Java
import org.apache.pekko.stream.IOResult;
import org.apache.pekko.stream.connectors.ftp.SftpSettings;
import org.apache.pekko.stream.connectors.ftp.javadsl.Sftp;
import org.apache.pekko.stream.javadsl.Source;
import org.apache.pekko.util.ByteString;
import java.util.concurrent.CompletionStage;

public class SftpRetrievingExample {

  public Source<ByteString, CompletionStage<IOResult>> retrieveFromPath(
      String path, SftpSettings settings) throws Exception {
    return Sftp.fromPath(path, settings.withMaxUnconfirmedReads(64));
  }
}
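
To see why more outstanding reads help, consider a rough back-of-the-envelope model. It is an assumption made for illustration, not a description of the connector's internals: if every read request fetches one fixed-size chunk and its reply takes one round trip, then at most maxUnconfirmedReads chunks can be in flight per round trip. The 32 KiB chunk size and the 50 ms round-trip time below are hypothetical numbers.

```java
public class UnconfirmedReadsEstimate {

  // Upper bound on bytes per second when at most `maxUnconfirmedReads`
  // chunks of `chunkBytes` bytes can be in flight during one round trip.
  static double maxBytesPerSecond(int maxUnconfirmedReads, int chunkBytes, double rttSeconds) {
    return maxUnconfirmedReads * (double) chunkBytes / rttSeconds;
  }

  public static void main(String[] args) {
    int chunkBytes = 32 * 1024; // hypothetical chunk size
    double rttSeconds = 0.05;   // hypothetical 50 ms round trip
    System.out.printf("1 read in flight:   %.0f bytes/s%n",
        maxBytesPerSecond(1, chunkBytes, rttSeconds));
    System.out.printf("64 reads in flight: %.0f bytes/s%n",
        maxBytesPerSecond(64, chunkBytes, rttSeconds));
  }
}
```

Under these assumptions the bound grows linearly with the number of unconfirmed reads, until the available bandwidth or the server becomes the limiting factor.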

Traversing a remote FTP folder recursively

In order to traverse a remote folder recursively, you need to use the ls method in the FTP API:

Scala
import org.apache.pekko
import pekko.NotUsed
import pekko.stream.connectors.ftp.{ FtpFile, FtpSettings }
import pekko.stream.connectors.ftp.scaladsl.Ftp
import pekko.stream.scaladsl.Source

def listFiles(basePath: String, settings: FtpSettings): Source[FtpFile, NotUsed] =
  Ftp.ls(basePath, settings)
Java
import org.apache.pekko.actor.ActorSystem;
import org.apache.pekko.stream.connectors.ftp.FtpSettings;
import org.apache.pekko.stream.connectors.ftp.javadsl.Ftp;

public class FtpTraversingExample {

  public void listFiles(String basePath, FtpSettings settings, ActorSystem system)
      throws Exception {
    Ftp.ls(basePath, settings)
        .runForeach(ftpFile -> System.out.println(ftpFile.toString()), system);
  }
}

This source emits FtpFile elements and has no meaningful materialized value (NotUsed).

For both FTPs and SFTP servers, you will need to use the FTPs and SFTP API respectively.

Retrieving files

In order to retrieve a remote file as a stream of bytes, you need to use the fromPath method in the FTP API:

Scala
import org.apache.pekko
import pekko.stream.IOResult
import pekko.stream.connectors.ftp.FtpSettings
import pekko.stream.connectors.ftp.scaladsl.Ftp
import pekko.stream.scaladsl.Source
import pekko.util.ByteString

import scala.concurrent.Future

def retrieveFromPath(path: String, settings: FtpSettings): Source[ByteString, Future[IOResult]] =
  Ftp.fromPath(path, settings)
Java
import org.apache.pekko.stream.IOResult;
import org.apache.pekko.stream.connectors.ftp.FtpSettings;
import org.apache.pekko.stream.connectors.ftp.javadsl.Ftp;
import org.apache.pekko.stream.javadsl.Source;
import org.apache.pekko.util.ByteString;
import java.util.concurrent.CompletionStage;

public class FtpRetrievingExample {

  public Source<ByteString, CompletionStage<IOResult>> retrieveFromPath(
      String path, FtpSettings settings) throws Exception {
    return Ftp.fromPath(path, settings);
  }
}

This source emits ByteString elements and materializes a Future of IOResult in the Scala API (a CompletionStage of IOResult in the Java API) that completes when the stream finishes.

For both FTPs and SFTP servers, you will need to use the FTPs and SFTP API respectively.

Writing files

In order to store a remote file from a stream of bytes, you need to use the toPath method in the FTP API:

Scala
import org.apache.pekko
import pekko.stream.IOResult
import pekko.stream.connectors.ftp.scaladsl.Ftp
import pekko.stream.scaladsl.Source
import pekko.util.ByteString
import scala.concurrent.Future

val result: Future[IOResult] = Source
  .single(ByteString("this is the file contents"))
  .runWith(Ftp.toPath("file.txt", ftpSettings))

// Create a gzipped target file
import org.apache.pekko.stream.scaladsl.Compression
val gzippedResult: Future[IOResult] = Source
  .single(ByteString("this is the file contents" * 50))
  .via(Compression.gzip)
  .runWith(Ftp.toPath("file.txt.gz", ftpSettings))
Java
import org.apache.pekko.stream.connectors.ftp.javadsl.Ftp;
import org.apache.pekko.stream.IOResult;
import org.apache.pekko.stream.javadsl.Compression;
import org.apache.pekko.stream.javadsl.Source;
import org.apache.pekko.util.ByteString;
import java.util.concurrent.CompletionStage;

CompletionStage<IOResult> result =
    Source.single(ByteString.fromString("this is the file contents"))
        .runWith(Ftp.toPath("file.txt", ftpSettings), materializer);

// Create a gzipped target file
CompletionStage<IOResult> gzippedResult =
    Source.single(ByteString.fromString("this is the file contents"))
        .via(Compression.gzip())
        .runWith(Ftp.toPath("file.txt.gz", ftpSettings), materializer);

This sink consumes ByteString elements and materializes a Future of IOResult in the Scala API (a CompletionStage of IOResult in the Java API) that completes when the stream finishes.

For both FTPs and SFTP servers, you will need to use the FTPs and SFTP API respectively.

Removing files

In order to remove a remote file, you need to use the remove method in the FTP API:

Scala
import org.apache.pekko
import pekko.stream.IOResult
import pekko.stream.connectors.ftp.{ FtpFile, FtpSettings }
import pekko.stream.connectors.ftp.scaladsl.Ftp
import pekko.stream.scaladsl.Sink

import scala.concurrent.Future

def remove(settings: FtpSettings): Sink[FtpFile, Future[IOResult]] =
  Ftp.remove(settings)
Java
import org.apache.pekko.stream.IOResult;
import org.apache.pekko.stream.connectors.ftp.FtpFile;
import org.apache.pekko.stream.connectors.ftp.FtpSettings;
import org.apache.pekko.stream.connectors.ftp.javadsl.Ftp;
import org.apache.pekko.stream.javadsl.Sink;
import java.util.concurrent.CompletionStage;

public class FtpRemovingExample {

  public Sink<FtpFile, CompletionStage<IOResult>> remove(FtpSettings settings) throws Exception {
    return Ftp.remove(settings);
  }
}

This sink consumes FtpFile elements and materializes a Future of IOResult in the Scala API (a CompletionStage of IOResult in the Java API) that completes when the stream finishes.

Moving files

In order to move a remote file, you need to use the move method in the FTP API. The move method takes a function to calculate the path to which the file should be moved based on the consumed FtpFile.

Scala
import org.apache.pekko
import pekko.stream.IOResult
import pekko.stream.connectors.ftp.{ FtpFile, FtpSettings }
import pekko.stream.connectors.ftp.scaladsl.Ftp
import pekko.stream.scaladsl.Sink

import scala.concurrent.Future

def move(destinationPath: FtpFile => String, settings: FtpSettings): Sink[FtpFile, Future[IOResult]] =
  Ftp.move(destinationPath, settings)
Java
import org.apache.pekko.stream.IOResult;
import org.apache.pekko.stream.connectors.ftp.FtpFile;
import org.apache.pekko.stream.connectors.ftp.FtpSettings;
import org.apache.pekko.stream.connectors.ftp.javadsl.Ftp;
import org.apache.pekko.stream.javadsl.Sink;
import java.util.concurrent.CompletionStage;
import java.util.function.Function;

public class FtpMovingExample {

  public Sink<FtpFile, CompletionStage<IOResult>> move(
      Function<FtpFile, String> destinationPath, FtpSettings settings) throws Exception {
    return Ftp.move(destinationPath, settings);
  }
}

This sink consumes FtpFile elements and materializes a Future of IOResult in the Scala API (a CompletionStage of IOResult in the Java API) that completes when the stream finishes.
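
The destination function passed to move is ordinary code that maps each consumed file to its new path. The sketch below uses a plain String as a stand-in for the value returned by ftpFile.path(), so it runs without the connector on the classpath. The helper name moveInto and the /processed directory are hypothetical.

```java
import java.util.function.Function;

public class DestinationPathExample {

  // Returns a function that keeps the file name and swaps the parent directory.
  // The input String stands in for the path of the consumed FtpFile.
  static Function<String, String> moveInto(String targetDirectory) {
    return sourcePath -> {
      // Everything after the last '/' is treated as the file name.
      String fileName = sourcePath.substring(sourcePath.lastIndexOf('/') + 1);
      return targetDirectory + "/" + fileName;
    };
  }

  public static void main(String[] args) {
    Function<String, String> destination = moveInto("/processed");
    System.out.println(destination.apply("/inbox/report.csv")); // prints /processed/report.csv
  }
}
```

With the connector, the same logic could be wrapped as ftpFile -> moveInto("/processed").apply(ftpFile.path()) and passed as the destination function.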

A typical use case is listing files from an FTP location, processing them, and moving them when done. An example of this use case can be found below.

Creating directory

To create a directory, specify a parent directory (also known as the base path) and the directory's name.

Apache Pekko Connectors provides a materialized API, mkdirAsync (based on Future in the Scala API and CompletionStage in the Java API), and an unmaterialized API, mkdir (based on Source), to let the user choose when the action is executed.

Scala
import org.apache.pekko
import pekko.NotUsed
import pekko.stream.scaladsl.Source
import pekko.stream.connectors.ftp.FtpSettings
import pekko.stream.connectors.ftp.scaladsl.Ftp
import pekko.Done

def mkdir(basePath: String, directoryName: String, settings: FtpSettings): Source[Done, NotUsed] =
  Ftp.mkdir(basePath, directoryName, settings)
Java
import org.apache.pekko.Done;
import org.apache.pekko.NotUsed;
import org.apache.pekko.stream.connectors.ftp.FtpSettings;
import org.apache.pekko.stream.connectors.ftp.javadsl.Ftp;
import org.apache.pekko.stream.javadsl.Source;

public class FtpMkdirExample {
  public Source<Done, NotUsed> mkdir(
      String parentPath, String directoryName, FtpSettings settings) {
    return Ftp.mkdir(parentPath, directoryName, settings);
  }
}

Note that ls includes traversed subdirectories in its result only when emitTraversedDirectories is set to true.

Example: downloading files from an FTP location and moving the original files

Scala
import java.nio.file.Files

import org.apache.pekko
import pekko.NotUsed
import pekko.stream.connectors.ftp.{ FtpFile, FtpSettings }
import pekko.stream.connectors.ftp.scaladsl.Ftp
import pekko.stream.scaladsl.{ FileIO, RunnableGraph }

def processAndMove(sourcePath: String,
    destinationPath: FtpFile => String,
    settings: FtpSettings): RunnableGraph[NotUsed] =
  Ftp
    .ls(sourcePath, settings)
    .flatMapConcat(ftpFile => Ftp.fromPath(ftpFile.path, settings).map((_, ftpFile)))
    .alsoTo(FileIO.toPath(Files.createTempFile("downloaded", "tmp")).contramap(_._1))
    .to(Ftp.move(destinationPath, settings).contramap(_._2))
Java
import org.apache.pekko.NotUsed;
import org.apache.pekko.japi.Pair;
import org.apache.pekko.stream.connectors.ftp.FtpFile;
import org.apache.pekko.stream.connectors.ftp.FtpSettings;
import org.apache.pekko.stream.connectors.ftp.javadsl.Ftp;
import org.apache.pekko.stream.javadsl.FileIO;
import org.apache.pekko.stream.javadsl.RunnableGraph;

import java.nio.file.Files;
import java.util.function.Function;

public class FtpProcessAndMoveExample {

  public RunnableGraph<NotUsed> processAndMove(
      String sourcePath, Function<FtpFile, String> destinationPath, FtpSettings settings)
      throws Exception {
    return Ftp.ls(sourcePath, settings)
        .flatMapConcat(
            ftpFile ->
                Ftp.fromPath(ftpFile.path(), settings).map(data -> new Pair<>(data, ftpFile)))
        .alsoTo(FileIO.toPath(Files.createTempFile("downloaded", "tmp")).contramap(Pair::first))
        .to(Ftp.move(destinationPath, settings).contramap(Pair::second));
  }
}

Running the example code

The code in this guide is part of runnable tests of this project. You are welcome to browse the code, edit and run it in sbt.

```
docker compose up -d ftp sftp
sbt
> ftp/test
```
Warning

When using the SFTP API, keep in mind that the JVM relies on /dev/random for random number generation by default. This can block the process on some operating systems, because /dev/random waits for a certain amount of entropy to be generated on the host machine before returning a result. In that case, consider passing the parameter -Djava.security.egd=file:/dev/./urandom to the JVM.
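
To confirm at runtime whether that option actually reached the JVM, an application can read the corresponding system property. This is a small sketch using only JDK calls; the class name and the fallback text are hypothetical, and it only reports what was passed as a system property.

```java
public class EntropySourceCheck {

  // Returns the value of the java.security.egd system property, or a
  // fallback note when the option was not supplied to the JVM.
  static String configuredEgd() {
    return System.getProperty("java.security.egd", "(not set, JVM default applies)");
  }

  public static void main(String[] args) {
    System.out.println("java.security.egd = " + configuredEgd());
  }
}
```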