Building a Go microservice using gRPC for image metadata extraction

Learn how to build a production-grade Go microservice that extracts EXIF and image metadata using gRPC, complete with S3/R2 integration and a realistic architecture for media pipelines.

by Steve McDougall

Modern media-heavy platforms survive or fail based on their ability to understand the media flowing through them. Whether you are building a stock photo platform, a digital asset manager, or a content automation pipeline, metadata forms the backbone of search, classification, and analytics.

In any such pipeline, a metadata service eventually becomes essential, because every system needs a reliable way to answer a fundamental question: "What exactly is this image?" Camera model, dimensions, geolocation, and colour space are not just technical details. They power search, filtering, ranking, and display logic downstream.

This article walks through designing and implementing a production-grade Go microservice that exposes a gRPC API for extracting image metadata. Rather than focusing on a simplified example, the implementation follows a realistic architecture. Clients send object storage references such as S3 keys or Cloudflare R2 paths. The service retrieves the file, extracts EXIF and general image metadata, and returns a structured response.

By the end, you will have a fully runnable service that can integrate directly into a larger media ingestion pipeline.

The realistic problem

Assume we're building a media ingestion system for a stock-photo platform. Creators upload high-resolution images into a storage bucket. A separate ingestion orchestrator sends a request to the metadata service, saying: "Extract metadata for the file at images/uploads/2025/11/beach-sunrise.jpg."

We don't want this microservice to handle uploads or storage writes. It only performs fetch, extract, and respond. This separation ensures ingestion pipelines stay flexible, the storage layer remains the source of truth, the service is stateless and horizontally scalable, and you can plug in different storage backends later.

The service needs to extract EXIF metadata (camera model, exposure, aperture, geolocation), dimensions, colour profile, orientation, file type, and optionally file size.

We'll build this using Go 1.22+, gRPC with Protocol Buffers, AWS S3 or Cloudflare R2 via the S3 API, and the goexif and imaging libraries.

Project structure

Before writing any code, it helps to establish the directory structure. Having a clear layout from the start makes it easier to understand where everything belongs as the service grows.

metadata-service/
├── cmd/
│   └── metadata/
│       └── main.go
├── internal/
│   ├── extractor/
│   │   ├── exif.go
│   │   └── image.go
│   ├── service/
│   │   └── service.go
│   └── storage/
│       └── s3.go
├── proto/
│   └── metadata/
│       └── v1/
│           └── metadata.proto
├── gen/
│   └── metadata/
│       └── v1/
│           ├── metadata.pb.go
│           └── metadata_grpc.pb.go
├── docker-compose.yml
├── Dockerfile
├── Makefile
├── go.mod
└── go.sum

The cmd/ directory holds our application entrypoint. The internal/ directory contains packages that shouldn't be imported by external projects. The proto/ directory stores our Protocol Buffer definitions, and gen/ holds the generated Go code.

Initializing the project

Create the project directory and initialize the Go module:

mkdir metadata-service && cd metadata-service
go mod init github.com/juststeveking/metadata-service

Now let's install our dependencies:

go get google.golang.org/grpc
go get google.golang.org/protobuf
go get github.com/aws/aws-sdk-go-v2/config
go get github.com/aws/aws-sdk-go-v2/service/s3
go get github.com/rwcarlsen/goexif/exif
go get github.com/disintegration/imaging

Your go.mod file should look something like this:

File: go.mod

module github.com/juststeveking/metadata-service
 
go 1.22
 
require (
	github.com/aws/aws-sdk-go-v2/config v1.28.0
	github.com/aws/aws-sdk-go-v2/service/s3 v1.65.0
	github.com/disintegration/imaging v1.6.2
	github.com/rwcarlsen/goexif v0.0.0-20190401172101-9e8deecbddbd
	google.golang.org/grpc v1.67.0
	google.golang.org/protobuf v1.35.1
)

The Makefile

Before we go further, let's set up a Makefile that will make our development workflow painless. I always create this early because it documents the commands I'll be running repeatedly.

File: Makefile

.PHONY: help proto build run test clean lint fmt deps docker-build docker-up docker-down grpc-test
 
# Default target
help:
	@echo "Available targets:"
	@echo "  make deps        - Install dependencies"
	@echo "  make proto       - Generate Go code from protobuf"
	@echo "  make build       - Build the binary"
	@echo "  make run         - Run the service locally"
	@echo "  make test        - Run tests"
	@echo "  make lint        - Run linter"
	@echo "  make fmt         - Format code"
	@echo "  make clean       - Remove build artifacts"
	@echo "  make docker-build - Build Docker image"
	@echo "  make docker-up   - Start Docker Compose stack"
	@echo "  make docker-down - Stop Docker Compose stack"
	@echo "  make grpc-test   - Test gRPC endpoint with grpcurl"
 
# Install dependencies
deps:
	go mod download
	go mod tidy
 
# Install protoc plugins
proto-deps:
	go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
	go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
 
# Generate protobuf code
proto: proto-deps
	@mkdir -p gen/metadata/v1
	protoc \
		--proto_path=proto \
		--go_out=gen \
		--go_opt=paths=source_relative \
		--go-grpc_out=gen \
		--go-grpc_opt=paths=source_relative \
		proto/metadata/v1/metadata.proto
 
# Build the binary
build:
	CGO_ENABLED=0 go build -o bin/metadata-service ./cmd/metadata
 
# Run locally
run:
	go run ./cmd/metadata
 
# Run with environment variables for local MinIO
run-local: docker-up
	AWS_ACCESS_KEY_ID=minioadmin \
	AWS_SECRET_ACCESS_KEY=minioadmin \
	AWS_ENDPOINT_URL=http://localhost:9000 \
	AWS_REGION=us-east-1 \
	go run ./cmd/metadata
 
# Run tests
test:
	go test -v -race -cover ./...
 
# Run linter
lint:
	golangci-lint run ./...
 
# Format code
fmt:
	go fmt ./...
	goimports -w .
 
# Clean build artifacts
clean:
	rm -rf bin/
	rm -rf gen/
 
# Build Docker image
docker-build:
	docker build -t metadata-service:latest .
 
# Start Docker Compose stack
docker-up:
	docker-compose up -d
 
# Stop Docker Compose stack
docker-down:
	docker-compose down
 
# Test gRPC endpoint
grpc-test:
	grpcurl -plaintext \
		-d '{"bucket":"images","object_key":"test.jpg"}' \
		localhost:50051 metadata.v1.MetadataService/ExtractMetadata
 
# Upload test image to MinIO
upload-test-image:
	@echo "Configuring MinIO client..."
	mc alias set local http://localhost:9000 minioadmin minioadmin 2>/dev/null || true
	@echo "Uploading test image..."
	mc cp $(IMAGE) local/images/test.jpg
	@echo "Done. Run 'make grpc-test' to test the service."

This Makefile covers the entire development lifecycle. Run make help to see all available commands.

Designing the protocol buffer schema

The .proto file defines our API contract. It must be explicit, strongly typed, and stable.

File: proto/metadata/v1/metadata.proto

syntax = "proto3";
 
package metadata.v1;
 
option go_package = "github.com/juststeveking/metadata-service/gen/metadata/v1;metadatav1";
 
message ExtractRequest {
  string bucket = 1;
  string object_key = 2;
}
 
message Exif {
  string camera_make = 1;
  string camera_model = 2;
  string lens_model = 3;
  string exposure_time = 4;
  string f_number = 5;
  string iso = 6;
  double latitude = 7;
  double longitude = 8;
}
 
message ImageProperties {
  uint32 width = 1;
  uint32 height = 2;
  string format = 3;
  string color_space = 4;
  uint64 file_size_bytes = 5;
}
 
message ExtractResponse {
  Exif exif = 1;
  ImageProperties properties = 2;
}
 
service MetadataService {
  rpc ExtractMetadata(ExtractRequest) returns (ExtractResponse);
}

The schema focuses on structured data, not blobs. Metadata should be queryable, not opaque.

Generate the Go code:

make proto
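
Among other things, the generated code defines the server interface our implementation must satisfy. A simplified sketch of what protoc-gen-go-grpc emits into gen/metadata/v1/metadata_grpc.pb.go (the real file contains more plumbing, and exact output depends on plugin versions):

// Simplified excerpt of the generated server interface.
type MetadataServiceServer interface {
	ExtractMetadata(context.Context, *ExtractRequest) (*ExtractResponse, error)
	mustEmbedUnimplementedMetadataServiceServer()
}

Embedding UnimplementedMetadataServiceServer in our service struct, as we do later, satisfies the mustEmbed method and keeps the build compiling if the proto gains new RPCs.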

Building the storage layer

The AWS Go SDK works with both AWS S3 and Cloudflare R2 because R2 exposes an S3-compatible API.

File: internal/storage/s3.go

package storage
 
import (
	"context"
	"fmt"
	"io"
 
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)
 
// maxFileSize caps fetched objects at 20 MiB to bound per-request memory.
const maxFileSize = 20 * 1024 * 1024
 
type S3Client struct {
	client *s3.Client
}
 
func NewS3Client(ctx context.Context) (*S3Client, error) {
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		return nil, fmt.Errorf("failed to load AWS config: %w", err)
	}
 
	return &S3Client{
		client: s3.NewFromConfig(cfg),
	}, nil
}
 
func (s *S3Client) FetchFile(ctx context.Context, bucket, key string) ([]byte, error) {
	out, err := s.client.GetObject(ctx, &s3.GetObjectInput{
		Bucket: &bucket,
		Key:    &key,
	})
	if err != nil {
		return nil, fmt.Errorf("failed to get object: %w", err)
	}
	defer out.Body.Close()

	// Read one byte past the limit so oversized files fail loudly instead
	// of being silently truncated into undecodable image data.
	data, err := io.ReadAll(io.LimitReader(out.Body, maxFileSize+1))
	if err != nil {
		return nil, fmt.Errorf("failed to read object body: %w", err)
	}
	if len(data) > maxFileSize {
		return nil, fmt.Errorf("object %s/%s exceeds %d byte limit", bucket, key, maxFileSize)
	}

	return data, nil
}

In production, you'd stream and only buffer what you need, but this is enough for a clear example.
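
One local-development note: recent aws-sdk-go-v2 releases honour the AWS_ENDPOINT_URL environment variable that our run-local target sets, but MinIO also expects path-style URLs (http://host:9000/bucket/key) rather than virtual-hosted ones. A minimal sketch of a constructor variant that enables this; the function name is an assumption, not part of the file above:

package storage

import (
	"context"
	"fmt"

	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// NewS3ClientPathStyle is a hypothetical variant of NewS3Client for
// S3-compatible stores such as MinIO that serve buckets at path-style URLs.
func NewS3ClientPathStyle(ctx context.Context) (*S3Client, error) {
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		return nil, fmt.Errorf("failed to load AWS config: %w", err)
	}

	return &S3Client{
		client: s3.NewFromConfig(cfg, func(o *s3.Options) {
			// MinIO expects path-style requests unless configured otherwise.
			o.UsePathStyle = true
		}),
	}, nil
}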

Building the extractors

I like to separate extraction logic into focused packages. EXIF parsing and image property extraction are distinct concerns.

File: internal/extractor/exif.go

package extractor

import (
	"bytes"
	"fmt"
	"strconv"

	"github.com/rwcarlsen/goexif/exif"

	pb "github.com/juststeveking/metadata-service/gen/metadata/v1"
)

func ParseExif(data []byte) *pb.Exif {
	x, err := exif.Decode(bytes.NewReader(data))
	if err != nil {
		return nil
	}

	out := &pb.Exif{}

	if cam, err := x.Get(exif.Make); err == nil && cam != nil {
		out.CameraMake, _ = cam.StringVal()
	}

	if model, err := x.Get(exif.Model); err == nil && model != nil {
		out.CameraModel, _ = model.StringVal()
	}

	if lens, err := x.Get(exif.LensModel); err == nil && lens != nil {
		out.LensModel, _ = lens.StringVal()
	}

	// ExposureTime and FNumber are stored as rationals and ISO as a short,
	// so StringVal would fail on them; read the numeric values instead.
	if exp, err := x.Get(exif.ExposureTime); err == nil && exp != nil {
		if num, den, err := exp.Rat2(0); err == nil && den != 0 {
			out.ExposureTime = fmt.Sprintf("%d/%d", num, den)
		}
	}

	if fnum, err := x.Get(exif.FNumber); err == nil && fnum != nil {
		if num, den, err := fnum.Rat2(0); err == nil && den != 0 {
			out.FNumber = fmt.Sprintf("%.1f", float64(num)/float64(den))
		}
	}

	if iso, err := x.Get(exif.ISOSpeedRatings); err == nil && iso != nil {
		if v, err := iso.Int(0); err == nil {
			out.Iso = strconv.Itoa(v)
		}
	}

	if lat, lon, err := x.LatLong(); err == nil {
		out.Latitude = lat
		out.Longitude = lon
	}

	return out
}

Real-world EXIF is a nightmare: inconsistent, missing, or corrupted. Tolerate all of it. Notice how we return nil on decode failure rather than propagating the error: a missing or broken EXIF block shouldn't kill the entire extraction.

File: internal/extractor/image.go

package extractor
 
import (
	"bytes"
	"fmt"
 
	"github.com/disintegration/imaging"
 
	pb "github.com/juststeveking/metadata-service/gen/metadata/v1"
)
 
func ParseImageProperties(data []byte) (*pb.ImageProperties, error) {
	img, err := imaging.Decode(bytes.NewReader(data))
	if err != nil {
		return nil, fmt.Errorf("failed to decode image: %w", err)
	}
 
	bounds := img.Bounds()
 
	return &pb.ImageProperties{
		Width:         uint32(bounds.Dx()),
		Height:        uint32(bounds.Dy()),
		Format:        detectFormat(data),
		ColorSpace:    "sRGB", // assumed default; real detection requires ICC profile parsing
		FileSizeBytes: uint64(len(data)),
	}, nil
}
 
func detectFormat(data []byte) string {
	if len(data) < 12 {
		return "unknown"
	}
 
	switch {
	case bytes.HasPrefix(data, []byte{0xFF, 0xD8, 0xFF}):
		return "jpeg"
	case bytes.HasPrefix(data, []byte{0x89, 0x50, 0x4E, 0x47}):
		return "png"
	case bytes.HasPrefix(data, []byte("GIF")):
		return "gif"
	case bytes.HasPrefix(data, []byte("RIFF")) && string(data[8:12]) == "WEBP":
		return "webp"
	default:
		return "unknown"
	}
}
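
Since the Makefile already wires up make test, detectFormat is a natural first unit to cover. A small table-driven sketch, assuming it lives at internal/extractor/image_test.go:

package extractor

import "testing"

// Exercises the magic-byte prefixes detectFormat recognises, plus the
// short-input guard.
func TestDetectFormat(t *testing.T) {
	cases := []struct {
		name string
		data []byte
		want string
	}{
		{"jpeg", []byte{0xFF, 0xD8, 0xFF, 0xE0, 0, 0, 0, 0, 0, 0, 0, 0}, "jpeg"},
		{"png", append([]byte{0x89, 0x50, 0x4E, 0x47}, make([]byte, 8)...), "png"},
		{"gif", []byte("GIF89a\x00\x00\x00\x00\x00\x00"), "gif"},
		{"webp", []byte("RIFF\x00\x00\x00\x00WEBP"), "webp"},
		{"too short", []byte{0xFF, 0xD8}, "unknown"},
	}

	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			if got := detectFormat(tc.data); got != tc.want {
				t.Errorf("detectFormat(%s) = %q, want %q", tc.name, got, tc.want)
			}
		})
	}
}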

Building the gRPC service

Now we wire everything together into the actual service implementation.

File: internal/service/service.go

package service
 
import (
	"context"
	"log/slog"
 
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
 
	pb "github.com/juststeveking/metadata-service/gen/metadata/v1"
	"github.com/juststeveking/metadata-service/internal/extractor"
	"github.com/juststeveking/metadata-service/internal/storage"
)
 
type MetadataService struct {
	pb.UnimplementedMetadataServiceServer
	storage *storage.S3Client
	logger  *slog.Logger
}
 
func NewMetadataService(s3 *storage.S3Client, logger *slog.Logger) *MetadataService {
	return &MetadataService{
		storage: s3,
		logger:  logger,
	}
}
 
func (s *MetadataService) ExtractMetadata(
	ctx context.Context,
	req *pb.ExtractRequest,
) (*pb.ExtractResponse, error) {
	s.logger.Info("extracting metadata",
		"bucket", req.Bucket,
		"key", req.ObjectKey,
	)
 
	data, err := s.storage.FetchFile(ctx, req.Bucket, req.ObjectKey)
	if err != nil {
		s.logger.Error("failed to fetch file",
			"bucket", req.Bucket,
			"key", req.ObjectKey,
			"error", err,
		)
		return nil, status.Errorf(codes.NotFound, "failed to fetch file: %v", err)
	}
 
	s.logger.Info("file fetched",
		"bucket", req.Bucket,
		"key", req.ObjectKey,
		"size_bytes", len(data),
	)
 
	exifData := extractor.ParseExif(data)
 
	props, err := extractor.ParseImageProperties(data)
	if err != nil {
		s.logger.Error("failed to parse image properties",
			"bucket", req.Bucket,
			"key", req.ObjectKey,
			"error", err,
		)
		return nil, status.Errorf(codes.InvalidArgument, "invalid image: %v", err)
	}
 
	return &pb.ExtractResponse{
		Exif:       exifData,
		Properties: props,
	}, nil
}

Be strict where it matters and flexible where it doesn't. EXIF failure is tolerable; failing to decode the image at all is not.
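
One refinement worth considering: the handler above returns NotFound for every fetch failure, which mislabels credential and network errors. Here is a sketch of a helper that singles out a genuinely missing key; the fetchErrToStatus name is mine, not part of the service above. Because FetchFile wraps errors with %w, errors.As can still reach the underlying SDK error:

package service

import (
	"errors"

	"github.com/aws/aws-sdk-go-v2/service/s3/types"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// fetchErrToStatus maps storage errors to gRPC codes: a missing object is
// the caller's problem (NotFound), anything else is ours (Unavailable).
func fetchErrToStatus(err error) error {
	var noKey *types.NoSuchKey
	if errors.As(err, &noKey) {
		return status.Error(codes.NotFound, "object not found")
	}
	return status.Errorf(codes.Unavailable, "storage fetch failed: %v", err)
}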

The application entrypoint

File: cmd/metadata/main.go

package main
 
import (
	"context"
	"log/slog"
	"net"
	"os"
	"os/signal"
	"syscall"
 
	"google.golang.org/grpc"
	"google.golang.org/grpc/reflection"
 
	pb "github.com/juststeveking/metadata-service/gen/metadata/v1"
	"github.com/juststeveking/metadata-service/internal/service"
	"github.com/juststeveking/metadata-service/internal/storage"
)
 
func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
		Level: slog.LevelInfo,
	}))
 
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
 
	s3Client, err := storage.NewS3Client(ctx)
	if err != nil {
		logger.Error("failed to create S3 client", "error", err)
		os.Exit(1)
	}
 
	listener, err := net.Listen("tcp", ":50051")
	if err != nil {
		logger.Error("failed to listen", "error", err)
		os.Exit(1)
	}
 
	server := grpc.NewServer()
	svc := service.NewMetadataService(s3Client, logger)
 
	pb.RegisterMetadataServiceServer(server, svc)
	reflection.Register(server)
 
	go func() {
		sigCh := make(chan os.Signal, 1)
		signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)
		<-sigCh
 
		logger.Info("shutting down gracefully")
		server.GracefulStop()
		cancel()
	}()
 
	logger.Info("metadata service listening", "port", 50051)
	if err := server.Serve(listener); err != nil {
		logger.Error("serve error", "error", err)
		os.Exit(1)
	}
}

I've added gRPC reflection so we can use grpcurl without manually specifying the proto file.
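
Reflection is handy for grpcurl, but real callers such as the ingestion orchestrator will use the generated client. A minimal sketch of one; the address and timeout are assumptions:

package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	pb "github.com/juststeveking/metadata-service/gen/metadata/v1"
)

func main() {
	// Plaintext is fine locally; production traffic should use TLS.
	conn, err := grpc.NewClient("localhost:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("dial: %v", err)
	}
	defer conn.Close()

	client := pb.NewMetadataServiceClient(conn)

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	resp, err := client.ExtractMetadata(ctx, &pb.ExtractRequest{
		Bucket:    "images",
		ObjectKey: "test.jpg",
	})
	if err != nil {
		log.Fatalf("extract: %v", err)
	}

	log.Printf("format=%s %dx%d",
		resp.GetProperties().GetFormat(),
		resp.GetProperties().GetWidth(),
		resp.GetProperties().GetHeight())
}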

Local development with Docker Compose

We need MinIO running locally to simulate S3.

File: docker-compose.yml

services:
  minio:
    image: minio/minio:latest
    ports:
      - "9000:9000"
      - "9001:9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    command: server /data --console-address ":9001"
    volumes:
      - minio_data:/data
 
  createbucket:
    image: minio/mc:latest
    depends_on:
      - minio
    entrypoint: >
      /bin/sh -c "
      sleep 5;
      mc alias set local http://minio:9000 minioadmin minioadmin;
      mc mb local/images --ignore-existing;
      exit 0;
      "
 
volumes:
  minio_data:

Production Dockerfile

File: Dockerfile

FROM golang:1.22-alpine AS builder
 
WORKDIR /app
 
COPY go.mod go.sum ./
RUN go mod download
 
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o metadata-service ./cmd/metadata
 
FROM gcr.io/distroless/static-debian12
 
COPY --from=builder /app/metadata-service /metadata-service
 
EXPOSE 50051
 
ENTRYPOINT ["/metadata-service"]

Running the service

Now let's put it all together. Start the infrastructure:

make docker-up

Upload a test image to MinIO:

make upload-test-image IMAGE=~/path/to/your/test-image.jpg

Run the service:

make run-local

In another terminal, test with grpcurl:

make grpc-test

You should see output like:

{
  "exif": {
    "cameraMake": "Canon",
    "cameraModel": "EOS R5",
    "iso": "400",
    "latitude": 51.509865,
    "longitude": -0.118092
  },
  "properties": {
    "width": 8192,
    "height": 5464,
    "format": "jpeg",
    "colorSpace": "sRGB",
    "fileSizeBytes": 24388123
  }
}

Production considerations

This service is intentionally minimal, but there are a few things you'll want to add for real production use.

  • Timeouts: Wrap your context with a deadline. No request should take longer than a few seconds; see the interceptor sketch after this list.
  • Memory: Avoid loading whole files when possible. Production pipelines regularly see images over 100MB. For real robustness, use image.DecodeConfig instead of fully decoding just to get dimensions.
  • Rate limiting: Internal ingestion systems can go wild under load. Use gRPC interceptors to enforce request rates and body limits.
  • Observability: Add Prometheus metrics for request counts, latencies, and error rates. Use OpenTelemetry for distributed tracing.
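
As a starting point for the timeout item above, here is a minimal sketch of a unary interceptor that caps every request at a fixed deadline. The package name and the ten-second budget are assumptions to adapt:

package middleware

import (
	"context"
	"time"

	"google.golang.org/grpc"
)

// UnaryTimeout derives a bounded context for each request so a slow
// storage fetch cannot hold a worker indefinitely.
func UnaryTimeout(timeout time.Duration) grpc.UnaryServerInterceptor {
	return func(
		ctx context.Context,
		req any,
		info *grpc.UnaryServerInfo,
		handler grpc.UnaryHandler,
	) (any, error) {
		ctx, cancel := context.WithTimeout(ctx, timeout)
		defer cancel()
		return handler(ctx, req)
	}
}

Wire it in when constructing the server: grpc.NewServer(grpc.UnaryInterceptor(middleware.UnaryTimeout(10 * time.Second))).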

Final thoughts

This design mirrors what production media companies actually deploy: a clean, stateless gRPC microservice that speaks in storage keys, not bytes, and extracts only what downstream systems need. It's scalable, observable, easy to test, easy to extend, and plays well inside larger ingestion workflows.

The key principles are to treat storage as the source of truth, make EXIF optional (not mandatory), enforce strict boundaries and timeouts, keep the API strongly typed, and avoid cleverness. Do the obvious, robust thing.

If you're building a real media processing pipeline, this service is an essential building block. And you now have a complete, runnable blueprint for a reliable one.
