Migrating Serverless Solutions from Cloud to On Premise — Pitfalls and Lessons Learned
- Altiv Labs
- Mar 12
At Altiv Labs, we recently helped a client bring their ambitious Internet of Things (IoT) vision to life. Built entirely on AWS, their platform seamlessly collects, processes, and monitors data from hundreds of connected devices spread across multiple locations.
The solution combines real-time data ingestion, smart message transformation and routing, and georeferenced data storage — all tied together with web-based dashboards for easy analysis and decision-making. Behind the scenes, a suite of serverless services and message queues keeps everything running smoothly, while the intuitive frontend gives teams a unified view and powerful tools to manage critical operations.
Designed for efficiency and scalability, the platform successfully supported a wide range of clients and enabled rapid growth — all powered by its cloud-native architecture.

However, a new challenge emerged: a customer needed the solution to operate entirely offline, without any reliance on cloud services. To meet this requirement, we set out to migrate the platform to an on premise environment, guided by two key goals:
preserve as much of the existing codebase as possible.
ensure the offline version delivered the same full range of features and performance as the original cloud solution.
Our first step was to carefully rethink each component of the platform and plan how to adapt it to the new offline environment. Early on, we decided to orchestrate the entire infrastructure with Docker — a choice that significantly streamlined both deployment and long-term maintenance.
Infrastructure Setup
We began by organizing the infrastructure across two main servers:
Application Server: handled message processing and hosted all core services.
Database Server: dedicated solely to PostgreSQL, ensuring that intensive queries would not affect the performance of the rest of the system.
Since the solution would operate within an internal network, we implemented an internal DNS and generated our own security certificates. To achieve this, we created a custom Certificate Authority (CA) and installed it on all network devices, guaranteeing end-to-end authentication and encryption across the environment.
Stack Components

MQTT Broker:
For device communication, we needed an open-source, lightweight, and secure broker. We chose Mosquitto (Eclipse Foundation), which perfectly matched our requirements — especially with its robust device permission and authentication controls.
PostgreSQL with PostGIS:
This part was straightforward. We were able to reuse nearly all the database creation scripts from the cloud environment, requiring only minor adjustments to fit the on premise setup.
RabbitMQ:
To manage message flow and ensure resilience, we implemented RabbitMQ as our message queuing system. We also introduced an intermediate service, called the Bridge, to connect the MQTT and AMQP protocols.
Beyond protocol bridging, the Bridge played a key role in scaling: it distributed messages across multiple queues, allowing several consumers to process them in parallel (we will dive deeper into this later).
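For illustration, here is a minimal sketch of what the Bridge can look like, assuming the mqtt and amqplib Node.js clients; the hostnames, certificate path, topic filter, and queue name are placeholders, and the multi-queue distribution is covered later:
import fs from "node:fs";
import mqtt from "mqtt";
import amqp from "amqplib";

// Minimal Bridge sketch: subscribe to device topics on Mosquitto (MQTT over TLS,
// trusting our custom CA) and republish each message to a RabbitMQ queue.
async function startBridge() {
  const connection = await amqp.connect(process.env.AMQP_URL as string);
  const channel = await connection.createChannel();
  await channel.assertQueue("mqtt-messages-0", { durable: true });

  const client = mqtt.connect("mqtts://mqtt.internal:8883", {
    ca: fs.readFileSync("/certs/ca.crt"),
    username: process.env.MQTT_USERNAME,
    password: process.env.MQTT_PASSWORD,
  });

  client.on("connect", () => client.subscribe("devices/+/data"));
  client.on("message", (topic, payload) => {
    // Wrap the raw MQTT message in a JSON envelope (the consumers read the topic field)
    const msg = { topic, payload: payload.toString(), receivedAt: Date.now() };
    channel.sendToQueue("mqtt-messages-0", Buffer.from(JSON.stringify(msg)), { persistent: true });
  });
}

startBridge();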
Migrating IoT Lambdas
With Mosquitto, PostgreSQL, and RabbitMQ in place, we moved on to migrating the AWS IoT Lambdas. In our original cloud setup, these serverless functions were event-driven: they processed incoming device messages, transformed the data, and triggered workflows based on IoT events — all in real time.
For the on premise version, we reimplemented these Lambdas as Node.js applications, replacing Lambda.invoke calls with local function calls.
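As a rough illustration of that change (the module path and handler name below are made up for the example):
// Before, in the cloud (AWS SDK v2, simplified):
//   await lambda.invoke({ FunctionName: "transform-message", Payload: JSON.stringify(event) }).promise();

// After, on premise: the same handler is imported and called as a plain local function.
import { handler as transformMessage } from "./lambdas/transformMessage";

export async function onDeviceMessage(event: unknown) {
  // Same code that previously ran inside AWS Lambda, now executed in-process
  return transformMessage(event);
}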
The main challenge was performance: each message took around 300ms to process, and everything initially ran in a single-threaded model with just one queue, which severely limited throughput. As a first step, we tried to make better use of Node.js’s event loop by keeping one in-flight processing promise per device:
// In-memory state: buffered messages and a processing flag per device MAC
const msgsDevices: Record<string, amqp.ConsumeMessage[]> = {};
const devicesProcessing: Record<string, boolean> = {};

channel.consume(queueName, async (message) => {
  if (message !== null) {
    const messageJson = JSON.parse(message.content.toString());
    const mac = messageJson.topic.toUpperCase();
    if (!msgsDevices[mac]) msgsDevices[mac] = [];
    if (msgsDevices[mac].length < BATCH_SIZE) {
      msgsDevices[mac].push(message);
      // Start a processing loop only if none is already running for this device
      if (!devicesProcessing[mac]) {
        devicesProcessing[mac] = true;
        await processDeviceQueue(mac); // processes the device's buffered messages sequentially
        delete devicesProcessing[mac];
      }
    } else {
      // Buffer full: requeue the message for later redelivery
      channel.nack(message, false, true);
    }
  }
});
While this approach brought some improvements, it still didn’t fully take advantage of the server’s processing capacity.
Scaling with Multi-Threads and Multi-Queues
Since the server’s resources were still underutilized, we implemented a multi-threaded, multi-queue architecture. Using Node.js’s cluster module, we created one worker process per CPU core, with each worker handling its own dedicated queue.
import cluster from "node:cluster";
import os from "node:os";

if (cluster.isPrimary) {
  // One worker process (and one dedicated queue) per CPU core
  const numWorkers = os.cpus().length;
  const amqpUrl = `amqp://${process.env.AMQP_USERNAME}:${process.env.AMQP_PASSWORD}@${process.env.AMQP_HOST}`;
  for (let i = 0; i < numWorkers; i++) {
    cluster.fork({
      AMQP_URL: amqpUrl,
      QUEUE_NAME: `mqtt-messages-${i}`,
    });
  }
  // Respawn a replacement whenever a worker dies (in production the replacement
  // must be handed the same QUEUE_NAME as the worker that exited)
  cluster.on("exit", (worker, code, signal) => {
    cluster.fork();
  });
} else {
  // worker code...
}
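For completeness, each worker's consumer loop looks roughly like the sketch below, assuming the amqplib client; handleDeviceMessage stands in for the migrated Lambda logic:
import amqp from "amqplib";

// Placeholder for the migrated Lambda processing logic
async function handleDeviceMessage(payload: unknown): Promise<void> {
  // transform, persist, and route the message...
}

// Each worker connects to RabbitMQ and consumes only its own dedicated queue
async function runWorker() {
  const connection = await amqp.connect(process.env.AMQP_URL as string);
  const channel = await connection.createChannel();
  const queueName = process.env.QUEUE_NAME as string;

  await channel.assertQueue(queueName, { durable: true });
  channel.prefetch(1); // handle one message at a time to keep per-device ordering

  await channel.consume(queueName, async (message) => {
    if (message === null) return;
    try {
      await handleDeviceMessage(JSON.parse(message.content.toString()));
      channel.ack(message);
    } catch (error) {
      channel.nack(message, false, true); // requeue on failure
    }
  });
}

runWorker();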
Message Distribution by Queues
A key challenge in this new setup was ensuring that, even with concurrent processing, messages for each device were handled sequentially.
To achieve this, we used each device’s MAC address as the basis for distribution: we computed a SHA-256 hash of the address and used it to consistently assign that device’s messages to one of the available queues, with each queue tied to a specific worker:
import crypto from "node:crypto";

// Map a MAC address to a stable queue index
function getQueueIndex(macAddress: string): number {
  const hash = crypto.createHash("sha256").update(macAddress).digest("hex");
  const hashInt = parseInt(hash.substring(0, 8), 16);
  return hashInt % NUM_QUEUES;
}

const queueIndex = getQueueIndex(macAddress);
const deviceQueue = `mqtt-messages-${queueIndex}`;
channel.sendToQueue(deviceQueue, Buffer.from(JSON.stringify(msg)), {
  persistent: true,
});
With this approach, we were finally able to fully utilize the server’s resources and reach the performance levels needed to handle the expected message volume.

Migrating API Lambdas
On the backend side, we faced the challenge of migrating the API Lambdas. To address this, we refactored the functions into a Node.js monolith using Express, aiming to preserve as much of the existing codebase as possible while also simplifying future maintenance.
Standardizing Lambda Integration
In the original cloud-based solution, each API route was implemented as a separate AWS Lambda function, following standard AWS conventions. To avoid rewriting each Lambda from scratch, we developed a middleware layer that simulates the event format typically received from AWS. This allowed the Lambdas to operate with minimal modifications.
export const handlePathParameters = (req: Request, res: Response, next: NextFunction) => {
  req.pathParameters = req.params;
  req.queryStringParameters = req.query;
  req.multiValueQueryStringParameters = req.query;
  next();
};

export const handleQueryStringParameters = (req: Request, res: Response, next: NextFunction) => {
  req.queryStringParameters = req.query;
  req.multiValueQueryStringParameters = req.query;
  next();
};
This way, we simply inserted the middleware before invoking each corresponding Lambda, making all parameter parsing completely transparent.
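As a sketch of the wiring (the route, handler import, and module paths below are illustrative rather than the project's actual names):
import express from "express";
// An existing Lambda handler reused with minimal changes (path is illustrative)
import { handler as listDevices } from "./lambdas/listDevices";
import { handlePathParameters } from "./middlewares/awsEvent";

const app = express();
app.use(express.json());

// The middleware shapes req like the API Gateway event, so the Lambda handler
// can read pathParameters and queryStringParameters exactly as it did on AWS.
app.get("/devices/:id", handlePathParameters, async (req, res) => {
  const result = await listDevices(req as any);
  res.status(result.statusCode).send(result.body ? JSON.parse(result.body) : undefined);
});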
Automating Routes with Swagger
Since we were already using a Swagger file to document all routes, we took the opportunity to automate endpoint creation in Express. We built a script that parses the Swagger definition, generates the route skeletons, and organizes the folders based on the predefined structure.

After automating the route generation, the only manual task left was to connect each Express route to its corresponding Lambda and ensure that responses were properly handled.
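The generator itself can be quite small. Here is a simplified sketch that assumes a JSON Swagger file and uses each operation's operationId as the handler name (both assumptions made for the example):
import fs from "node:fs";

// Read the Swagger definition and emit one Express route registration per path/method
const swagger = JSON.parse(fs.readFileSync("swagger.json", "utf-8"));

for (const [swaggerPath, methods] of Object.entries(swagger.paths)) {
  // Convert Swagger-style /devices/{id} to Express-style /devices/:id
  const expressPath = swaggerPath.replace(/\{(\w+)\}/g, ":$1");
  for (const method of Object.keys(methods as object)) {
    const handlerName = (methods as any)[method].operationId ?? "todoHandler";
    console.log(`router.${method}("${expressPath}", handlePathParameters, ${handlerName});`);
  }
}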
Replacing AWS Cognito
Authentication was one of the most critical parts of the migration, originally managed by AWS Cognito. To preserve the same authentication and authorization rules, we built our own JWT-based authentication service, replicating the original permission validation logic. Here’s an example of the middleware implementation:
import { NextFunction, Request, Response } from 'express';
import jwt from 'jsonwebtoken';

// Simple in-memory cache of each token's allowed endpoints
// (userPermissionService and logger are project-level services)
const CACHE: Record<string, any> = {};

export class AuthService {
  public static readonly JWT_SECRET = process.env.JWT_SECRET as string;

  public async authenticateMiddleware(req: Request, res: Response, next: NextFunction): Promise<void> {
    const authHeader = req.headers.authorization;

    // Public image routes skip authentication
    if (req.originalUrl.includes('/files/image')) {
      next();
      return;
    }
    if (!authHeader) {
      res.status(403).send({ status: '403', message: 'You do not have permission to access this resource' });
      return;
    }

    const token = authHeader.replace('Bearer ', '');
    try {
      const decoded = jwt.verify(token, AuthService.JWT_SECRET) as { userArn: string, 'custom:accountId': number };
      const accountId = decoded['custom:accountId'];

      // Quick cache check
      const userEndpoints = CACHE[token];
      if (userEndpoints) {
        const hasPermission = userPermissionService.has(userEndpoints, req.originalUrl, req.method);
        if (hasPermission) {
          next();
          return;
        } else {
          res.status(403).send({ status: '403', message: 'You do not have permission to access this resource' });
          return;
        }
      }

      try {
        const { hasPermission, endpoints } = await userPermissionService.verifyUserPermission(
          decoded.userArn, accountId, req.originalUrl, req.method
        );
        CACHE[token] = endpoints;
        setTimeout(() => CACHE[token] = undefined, 1000 * 60); // Expires in 1 minute
        if (hasPermission) {
          next();
          return;
        } else {
          res.status(403).send({ status: '403', message: 'You do not have permission to access this resource' });
          return;
        }
      } catch (error) {
        logger.error(error);
        res.status(500).json({ message: 'Internal server error' });
        return;
      }
    } catch (error: any) {
      logger.info(error);
      res.status(401).json({ message: 'Invalid or expired token' });
      return;
    }
  }

  public async verifyToken(token: string): Promise<boolean> {
    try {
      jwt.verify(token, AuthService.JWT_SECRET) as { id: number };
      return true;
    } catch (error: any) {
      return false;
    }
  }
}
This approach allowed us to retain the same authentication and authorization rules as Cognito, but in a fully independent setup — using JWTs and managing all permissions directly through the database.
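For completeness, issuing a token at sign-in is straightforward with the same library; the sketch below uses the claim names the middleware above expects and an arbitrary expiry, with the credential check against the database omitted:
import jwt from "jsonwebtoken";

// Issue a JWT once the user's credentials have been validated against the database
function issueToken(userArn: string, accountId: number): string {
  return jwt.sign(
    { userArn, "custom:accountId": accountId }, // same claims the middleware decodes
    process.env.JWT_SECRET as string,
    { expiresIn: "1h" } // illustrative lifetime
  );
}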
Migrating the Frontend
The frontend migration was relatively straightforward. Our original stack already used React for the user interface, distributed through AWS CloudFront. In the on premise environment, we kept the same React build but served the static files using an Nginx server instead.
We manually configured SSL certificates to ensure that the entire application remained accessible over HTTPS.
Monitoring and Logging
Another important aspect of the migration was handling system logs and monitoring. In the AWS environment, AWS CloudWatch provided centralized logging, metrics, and alerting for all our services, making it easy to visualize logs, track system health, and investigate incidents.
With the move to on premise, we needed a way to ensure the same level of observability and reliability, but using open-source tools under our own control.
Log Management with logrotate
For log management, we configured logrotate across all service containers and host systems. Each service writes its logs to local files, while logrotate takes care of automated log rotation, compression, and retention based on policies we define. This ensures that logs remain easy to inspect without risking disk space exhaustion.
/var/log/postgresql/*.log {
    daily
    dateext
    dateformat -%Y-%m-%d
    rotate __DB_LOGROTATE_ROTATE__
    maxage __DB_LOGROTATE_MAXAGE__
    size __DB_LOGROTATE_SIZE__
    missingok
    notifempty
    compress
    copytruncate
}
Metrics and Monitoring with Prometheus
For real-time monitoring, metrics collection, and alerting, we adopted Prometheus. Each server runs a node_exporter agent, exposing system metrics — such as CPU, memory, disk, and network usage — through a dedicated endpoint. Prometheus scrapes these endpoints at regular intervals, storing the data in its time-series database.
In an on premise environment, having a strong alerting system is critical for maintaining reliability and quickly catching issues. Once we moved off the cloud, early detection of anomalies — like unexpected spikes in resource usage, service downtime, or performance degradation — became even more essential.
With Prometheus, we were able to configure custom alert rules tailored to our platform’s specific needs. These alerts act as an early warning system, instantly notifying the team whenever something goes wrong, so we can respond quickly and prevent small issues from becoming major problems:
groups:
  - name: cpu_alerts
    rules:
      - alert: CPUUsageWarning
        expr: (100 - (avg by(instance)(irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)) > 60
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"
          description: "The CPU usage is above 60% for more than 5 minutes."
      - alert: CPUUsageCritical
        expr: (100 - (avg by(instance)(irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)) > 80
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High CPU usage detected"
          description: "The CPU usage is above 80% for more than 5 minutes."
Conclusion
Migrating the solution from the cloud to an on premise setup was a major step — and overall, it turned out to be a great success. We preserved all the critical features our clients relied on while gaining much greater control over the environment. One of the biggest wins was eliminating cold start delays: with all services always running, everything now responds instantly.
Of course, it hasn’t been without challenges. Managing everything ourselves means taking on responsibilities that the cloud previously handled for us — including hardware maintenance, backups, and security updates. We lost some of the convenience and peace of mind that comes with AWS taking care of those “invisible” tasks behind the scenes.
Still, having full visibility and flexibility has made it easier to customize, monitor, and scale the platform to meet each client’s unique needs — especially for those requiring fully offline operation. In the end, it’s a trade-off: more responsibility, but also greater freedom, control, and reliability for the scenarios that matter most to this client.