Building a Go microservice using gRPC for image metadata extraction
Learn how to build a production-grade Go microservice that extracts EXIF and image metadata using gRPC, complete with S3/R2 integration and a realistic architecture for media pipelines.
Modern media-heavy platforms survive or fail based on their ability to understand the media flowing through them. Whether you are building a stock photo platform, a digital asset manager, or a content automation pipeline, metadata forms the backbone of search, classification, and analytics.
Across modern media pipelines, a metadata service eventually becomes essential. Every system needs a reliable way to answer a fundamental question: "What exactly is this image?" Camera model, dimensions, geolocation, and colour space are not just technical details. They power search, filtering, ranking, and display logic downstream.
This article walks through designing and implementing a production-grade Go microservice that exposes a gRPC API for extracting image metadata. Rather than focusing on a simplified example, the implementation follows a realistic architecture. Clients send object storage references such as S3 keys or Cloudflare R2 paths. The service retrieves the file, extracts EXIF and general image metadata, and returns a structured response.
By the end, you will have a fully runnable service that can integrate directly into a larger media ingestion pipeline.
The realistic problem
Assume we're building a media ingestion system for a stock-photo platform. Creators upload high-resolution images into a storage bucket. A separate ingestion orchestrator sends a request to the metadata service, saying: "Extract metadata for the file at images/uploads/2025/11/beach-sunrise.jpg."
We don't want this microservice to handle uploads or storage writes. Its job is to fetch, extract, and respond. This separation ensures ingestion pipelines stay flexible, the storage layer remains the source of truth, the service is stateless and horizontally scalable, and you can plug in different storage backends later.
The service needs to extract EXIF metadata (camera model, exposure, aperture, geolocation), dimensions, colour profile, orientation, file type, and optionally file size.
We'll build this using Go 1.22+, gRPC with Protocol Buffers, AWS S3 or Cloudflare R2 via the S3 API, and the goexif and imaging libraries.
Project structure
Before writing any code, it helps to establish the directory structure. Having a clear layout from the start makes it easier to understand where everything belongs as the service grows.
metadata-service/
├── cmd/
│   └── metadata/
│       └── main.go
├── internal/
│   ├── extractor/
│   │   ├── exif.go
│   │   └── image.go
│   ├── service/
│   │   └── service.go
│   └── storage/
│       └── s3.go
├── proto/
│   └── metadata/
│       └── v1/
│           └── metadata.proto
├── gen/
│   └── metadata/
│       └── v1/
│           ├── metadata.pb.go
│           └── metadata_grpc.pb.go
├── docker-compose.yml
├── Dockerfile
├── Makefile
├── go.mod
└── go.sum
The cmd/ directory holds our application entrypoint. The internal/ directory contains packages that shouldn't be imported by external projects. The proto/ directory stores our Protocol Buffer definitions, and gen/ holds the generated Go code.
Initializing the project
Create the project directory and initialize the Go module:
mkdir metadata-service && cd metadata-service
go mod init github.com/juststeveking/metadata-service
Now let's install our dependencies:
go get google.golang.org/grpc
go get google.golang.org/protobuf
go get github.com/aws/aws-sdk-go-v2/config
go get github.com/aws/aws-sdk-go-v2/service/s3
go get github.com/rwcarlsen/goexif/exif
go get github.com/disintegration/imaging
Your go.mod file should look something like this:
File: go.mod
module github.com/juststeveking/metadata-service
go 1.22
require (
github.com/aws/aws-sdk-go-v2/config v1.28.0
github.com/aws/aws-sdk-go-v2/service/s3 v1.65.0
github.com/disintegration/imaging v1.6.2
github.com/rwcarlsen/goexif v0.0.0-20190401172101-9e8deecbddbd
google.golang.org/grpc v1.67.0
google.golang.org/protobuf v1.35.1
)
The Makefile
Before we go further, let's set up a Makefile that will make our development workflow painless. I always create this early because it documents the commands I'll be running repeatedly.
File: Makefile
.PHONY: help proto build run test clean lint fmt deps docker-build docker-up docker-down grpc-test
# Default target
help:
@echo "Available targets:"
@echo " make deps - Install dependencies"
@echo " make proto - Generate Go code from protobuf"
@echo " make build - Build the binary"
@echo " make run - Run the service locally"
@echo " make test - Run tests"
@echo " make lint - Run linter"
@echo " make fmt - Format code"
@echo " make clean - Remove build artifacts"
@echo " make docker-build - Build Docker image"
@echo " make docker-up - Start Docker Compose stack"
@echo " make docker-down - Stop Docker Compose stack"
@echo " make grpc-test - Test gRPC endpoint with grpcurl"
# Install dependencies
deps:
go mod download
go mod tidy
# Install protoc plugins
proto-deps:
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
# Generate protobuf code
proto: proto-deps
@mkdir -p gen/metadata/v1
protoc \
--proto_path=proto \
--go_out=gen \
--go_opt=paths=source_relative \
--go-grpc_out=gen \
--go-grpc_opt=paths=source_relative \
proto/metadata/v1/metadata.proto
# Build the binary
build:
CGO_ENABLED=0 go build -o bin/metadata-service ./cmd/metadata
# Run locally
run:
go run ./cmd/metadata
# Run with environment variables for local MinIO
run-local: docker-up
AWS_ACCESS_KEY_ID=minioadmin \
AWS_SECRET_ACCESS_KEY=minioadmin \
AWS_ENDPOINT_URL=http://localhost:9000 \
AWS_REGION=us-east-1 \
go run ./cmd/metadata
# Run tests
test:
go test -v -race -cover ./...
# Run linter
lint:
golangci-lint run ./...
# Format code
fmt:
go fmt ./...
goimports -w .
# Clean build artifacts
clean:
rm -rf bin/
rm -rf gen/
# Build Docker image
docker-build:
docker build -t metadata-service:latest .
# Start Docker Compose stack
docker-up:
docker-compose up -d
# Stop Docker Compose stack
docker-down:
docker-compose down
# Test gRPC endpoint
grpc-test:
grpcurl -plaintext \
-d '{"bucket":"images","object_key":"test.jpg"}' \
localhost:50051 metadata.v1.MetadataService/ExtractMetadata
# Upload test image to MinIO
upload-test-image:
@echo "Configuring MinIO client..."
mc alias set local http://localhost:9000 minioadmin minioadmin 2>/dev/null || true
@echo "Uploading test image..."
mc cp $(IMAGE) local/images/test.jpg
@echo "Done. Run 'make grpc-test' to test the service."This Makefile covers the entire development lifecycle. Run make help to see all available commands.
Designing the protocol buffer schema
The .proto file defines our API contract. It must be explicit, strongly typed, and stable.
File: proto/metadata/v1/metadata.proto
syntax = "proto3";
package metadata.v1;
option go_package = "github.com/juststeveking/metadata-service/gen/metadata/v1;metadatav1";
message ExtractRequest {
string bucket = 1;
string object_key = 2;
}
message Exif {
string camera_make = 1;
string camera_model = 2;
string lens_model = 3;
string exposure_time = 4;
string f_number = 5;
string iso = 6;
double latitude = 7;
double longitude = 8;
}
message ImageProperties {
uint32 width = 1;
uint32 height = 2;
string format = 3;
string color_space = 4;
uint64 file_size_bytes = 5;
}
message ExtractResponse {
Exif exif = 1;
ImageProperties properties = 2;
}
service MetadataService {
rpc ExtractMetadata(ExtractRequest) returns (ExtractResponse);
}
The schema focuses on structured data, not blobs. Metadata should be queryable, not opaque.
Generate the Go code:
make proto
Building the storage layer
The AWS Go SDK works with both AWS S3 and Cloudflare R2 because R2 exposes an S3-compatible API.
File: internal/storage/s3.go
package storage
import (
"context"
"fmt"
"io"
"github.com/aws/aws-sdk-go-v2/config"
"github.com/aws/aws-sdk-go-v2/service/s3"
)
// 50 MB comfortably covers the large stills shown in the example output later; tune this cap for your pipeline.
const maxFileSize = 50 * 1024 * 1024
type S3Client struct {
client *s3.Client
}
func NewS3Client(ctx context.Context) (*S3Client, error) {
cfg, err := config.LoadDefaultConfig(ctx)
if err != nil {
return nil, fmt.Errorf("failed to load AWS config: %w", err)
}
return &S3Client{
client: s3.NewFromConfig(cfg),
}, nil
}
func (s *S3Client) FetchFile(ctx context.Context, bucket, key string) ([]byte, error) {
out, err := s.client.GetObject(ctx, &s3.GetObjectInput{
Bucket: &bucket,
Key: &key,
})
if err != nil {
return nil, fmt.Errorf("failed to get object: %w", err)
}
defer out.Body.Close()
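	// LimitReader caps the read at maxFileSize; an oversized object comes back
	// truncated and fails to decode downstream rather than exhausting memory here.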
return io.ReadAll(io.LimitReader(out.Body, maxFileSize))
}
In production, you'd stream and only buffer what you need, but this is enough for a clear example.
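One practical wrinkle: when the client points at MinIO locally (or at an R2 endpoint), you may need to override the endpoint and enable path-style addressing, and the latter can only be set in code. Here is a minimal sketch of an alternative constructor, assuming a hypothetical S3_ENDPOINT environment variable alongside the AWS_ENDPOINT_URL approach used in the Makefile's run-local target:
// Sketch: a hypothetical internal/storage/s3_local.go, not part of the listing above.
package storage

import (
	"context"
	"fmt"
	"os"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// NewS3ClientWithEndpoint builds a client against an S3-compatible endpoint
// such as MinIO or Cloudflare R2.
func NewS3ClientWithEndpoint(ctx context.Context) (*S3Client, error) {
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		return nil, fmt.Errorf("failed to load AWS config: %w", err)
	}

	client := s3.NewFromConfig(cfg, func(o *s3.Options) {
		if endpoint := os.Getenv("S3_ENDPOINT"); endpoint != "" {
			o.BaseEndpoint = aws.String(endpoint)
			// MinIO expects path-style URLs (http://host/bucket/key) rather than
			// virtual-hosted bucket subdomains.
			o.UsePathStyle = true
		}
	})

	return &S3Client{client: client}, nil
}
Keeping this in a separate constructor means the production path stays on plain LoadDefaultConfig.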
Building the extractors
I like to separate extraction logic into focused packages. EXIF parsing and image property extraction are distinct concerns.
File: internal/extractor/exif.go
package extractor
import (
	"bytes"
	"fmt"
	"github.com/rwcarlsen/goexif/exif"
	pb "github.com/juststeveking/metadata-service/gen/metadata/v1"
)
func ParseExif(data []byte) *pb.Exif {
x, err := exif.Decode(bytes.NewReader(data))
if err != nil {
return nil
}
out := &pb.Exif{}
if cam, err := x.Get(exif.Make); err == nil && cam != nil {
out.CameraMake, _ = cam.StringVal()
}
if model, err := x.Get(exif.Model); err == nil && model != nil {
out.CameraModel, _ = model.StringVal()
}
if lens, err := x.Get(exif.LensModel); err == nil && lens != nil {
out.LensModel, _ = lens.StringVal()
}
	// ExposureTime, FNumber, and ISO are not ASCII-typed tags, so StringVal would fail; convert them explicitly.
	if exp, err := x.Get(exif.ExposureTime); err == nil && exp != nil {
		if num, den, rerr := exp.Rat2(0); rerr == nil && den != 0 {
			out.ExposureTime = fmt.Sprintf("%d/%d", num, den)
		}
	}
	if fnum, err := x.Get(exif.FNumber); err == nil && fnum != nil {
		if num, den, rerr := fnum.Rat2(0); rerr == nil && den != 0 {
			out.FNumber = fmt.Sprintf("%.1f", float64(num)/float64(den))
		}
	}
	if iso, err := x.Get(exif.ISOSpeedRatings); err == nil && iso != nil {
		if v, ierr := iso.Int(0); ierr == nil {
			out.Iso = fmt.Sprintf("%d", v)
		}
	}
if lat, lon, err := x.LatLong(); err == nil {
out.Latitude = lat
out.Longitude = lon
}
return out
}
Real-world EXIF is a nightmare: inconsistent, missing, corrupted. Always tolerate that. Notice how we return nil on decode failure rather than propagating the error; EXIF failure shouldn't kill the entire extraction.
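If you want to lock that behaviour in, a small test is cheap. A sketch, assuming a new internal/extractor/exif_test.go:
package extractor

import "testing"

// Garbage bytes should produce no EXIF block rather than an error or a panic.
func TestParseExifToleratesGarbage(t *testing.T) {
	if got := ParseExif([]byte("definitely not a JPEG")); got != nil {
		t.Fatalf("expected nil EXIF for non-image input, got %+v", got)
	}
}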
File: internal/extractor/image.go
package extractor
import (
"bytes"
"fmt"
"github.com/disintegration/imaging"
pb "github.com/juststeveking/metadata-service/gen/metadata/v1"
)
func ParseImageProperties(data []byte) (*pb.ImageProperties, error) {
img, err := imaging.Decode(bytes.NewReader(data))
if err != nil {
return nil, fmt.Errorf("failed to decode image: %w", err)
}
bounds := img.Bounds()
return &pb.ImageProperties{
Width: uint32(bounds.Dx()),
Height: uint32(bounds.Dy()),
Format: detectFormat(data),
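		// ColorSpace is a deliberate simplification; a fuller implementation would read
		// the embedded ICC profile or the EXIF ColorSpace tag rather than assuming sRGB.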
ColorSpace: "sRGB",
FileSizeBytes: uint64(len(data)),
}, nil
}
func detectFormat(data []byte) string {
if len(data) < 12 {
return "unknown"
}
switch {
case bytes.HasPrefix(data, []byte{0xFF, 0xD8, 0xFF}):
return "jpeg"
case bytes.HasPrefix(data, []byte{0x89, 0x50, 0x4E, 0x47}):
return "png"
case bytes.HasPrefix(data, []byte("GIF")):
return "gif"
case bytes.HasPrefix(data, []byte("RIFF")) && string(data[8:12]) == "WEBP":
return "webp"
default:
return "unknown"
}
}
Building the gRPC service
Now we wire everything together into the actual service implementation.
File: internal/service/service.go
package service
import (
"context"
"log/slog"
"google.golang.org/grpc/codes"
"google.golang.org/grpc/status"
pb "github.com/juststeveking/metadata-service/gen/metadata/v1"
"github.com/juststeveking/metadata-service/internal/extractor"
"github.com/juststeveking/metadata-service/internal/storage"
)
type MetadataService struct {
pb.UnimplementedMetadataServiceServer
storage *storage.S3Client
logger *slog.Logger
}
func NewMetadataService(s3 *storage.S3Client, logger *slog.Logger) *MetadataService {
return &MetadataService{
storage: s3,
logger: logger,
}
}
func (s *MetadataService) ExtractMetadata(
ctx context.Context,
req *pb.ExtractRequest,
) (*pb.ExtractResponse, error) {
s.logger.Info("extracting metadata",
"bucket", req.Bucket,
"key", req.ObjectKey,
)
data, err := s.storage.FetchFile(ctx, req.Bucket, req.ObjectKey)
if err != nil {
s.logger.Error("failed to fetch file",
"bucket", req.Bucket,
"key", req.ObjectKey,
"error", err,
)
return nil, status.Errorf(codes.NotFound, "failed to fetch file: %v", err)
}
s.logger.Info("file fetched",
"bucket", req.Bucket,
"key", req.ObjectKey,
"size_bytes", len(data),
)
exifData := extractor.ParseExif(data)
props, err := extractor.ParseImageProperties(data)
if err != nil {
s.logger.Error("failed to parse image properties",
"bucket", req.Bucket,
"key", req.ObjectKey,
"error", err,
)
return nil, status.Errorf(codes.InvalidArgument, "invalid image: %v", err)
}
return &pb.ExtractResponse{
Exif: exifData,
Properties: props,
}, nil
}
Be strict where it matters and flexible where it doesn't. EXIF failure is tolerable; failing to decode the image at all is not.
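One place worth being stricter than the listing above is input validation, so malformed requests never reach the storage layer. A sketch of a hypothetical helper added to internal/service/service.go (it reuses the imports already in that file):
// validateRequest rejects requests with missing fields before any storage call is made.
func validateRequest(req *pb.ExtractRequest) error {
	if req.GetBucket() == "" {
		return status.Error(codes.InvalidArgument, "bucket is required")
	}
	if req.GetObjectKey() == "" {
		return status.Error(codes.InvalidArgument, "object_key is required")
	}
	return nil
}
Call it at the top of ExtractMetadata and return its error directly.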
The application entrypoint
File: cmd/metadata/main.go
package main
import (
"context"
"log/slog"
"net"
"os"
"os/signal"
"syscall"
"google.golang.org/grpc"
"google.golang.org/grpc/reflection"
pb "github.com/juststeveking/metadata-service/gen/metadata/v1"
"github.com/juststeveking/metadata-service/internal/service"
"github.com/juststeveking/metadata-service/internal/storage"
)
func main() {
logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
Level: slog.LevelInfo,
}))
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
s3Client, err := storage.NewS3Client(ctx)
if err != nil {
logger.Error("failed to create S3 client", "error", err)
os.Exit(1)
}
listener, err := net.Listen("tcp", ":50051")
if err != nil {
logger.Error("failed to listen", "error", err)
os.Exit(1)
}
server := grpc.NewServer()
svc := service.NewMetadataService(s3Client, logger)
pb.RegisterMetadataServiceServer(server, svc)
reflection.Register(server)
go func() {
sigCh := make(chan os.Signal, 1)
signal.Notify(sigCh, syscall.SIGINT, syscall.SIGTERM)
<-sigCh
logger.Info("shutting down gracefully")
server.GracefulStop()
cancel()
}()
logger.Info("metadata service listening", "port", 50051)
if err := server.Serve(listener); err != nil {
logger.Error("serve error", "error", err)
os.Exit(1)
}
}
I've added gRPC reflection so we can use grpcurl without manually specifying the proto file.
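grpcurl is convenient for poking at the service, but the ingestion orchestrator will ultimately call it from code. Once the local stack described in the next section is running, a minimal Go client might look like this (a sketch for a hypothetical cmd/client tool; the address and plaintext credentials are local-only assumptions):
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	pb "github.com/juststeveking/metadata-service/gen/metadata/v1"
)

func main() {
	// Plaintext is fine for local testing; production traffic should use TLS.
	conn, err := grpc.NewClient("localhost:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("failed to connect: %v", err)
	}
	defer conn.Close()

	client := pb.NewMetadataServiceClient(conn)

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	resp, err := client.ExtractMetadata(ctx, &pb.ExtractRequest{
		Bucket:    "images",
		ObjectKey: "test.jpg",
	})
	if err != nil {
		log.Fatalf("extract failed: %v", err)
	}

	fmt.Printf("%dx%d %s, shot on %s %s\n",
		resp.GetProperties().GetWidth(),
		resp.GetProperties().GetHeight(),
		resp.GetProperties().GetFormat(),
		resp.GetExif().GetCameraMake(),
		resp.GetExif().GetCameraModel(),
	)
}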
Local development with Docker Compose
We need MinIO running locally to simulate S3.
File: docker-compose.yml
services:
minio:
image: minio/minio:latest
ports:
- "9000:9000"
- "9001:9001"
environment:
MINIO_ROOT_USER: minioadmin
MINIO_ROOT_PASSWORD: minioadmin
command: server /data --console-address ":9001"
volumes:
- minio_data:/data
createbucket:
image: minio/mc:latest
depends_on:
- minio
entrypoint: >
/bin/sh -c "
sleep 5;
mc alias set local http://minio:9000 minioadmin minioadmin;
mc mb local/images --ignore-existing;
exit 0;
"
volumes:
minio_data:
Production Dockerfile
File: Dockerfile
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o metadata-service ./cmd/metadata
FROM gcr.io/distroless/static-debian12
COPY --from=builder /app/metadata-service /metadata-service
EXPOSE 50051
ENTRYPOINT ["/metadata-service"]
Running the service
Now let's put it all together. Start the infrastructure:
make docker-up
Upload a test image to MinIO:
make upload-test-image IMAGE=~/path/to/your/test-image.jpg
Run the service:
make run-local
In another terminal, test with grpcurl:
make grpc-test
You should see output like:
{
"exif": {
"cameraMake": "Canon",
"cameraModel": "EOS R5",
"iso": "400",
"latitude": 51.509865,
"longitude": -0.118092
},
"properties": {
"width": 8192,
"height": 5464,
"format": "jpeg",
"colorSpace": "sRGB",
"fileSizeBytes": 24388123
}
}
Production considerations
This service is intentionally minimal, but there are a few things you'll want to add for real production use.
- Timeouts: Wrap your context with a deadline. No request should take longer than a few seconds; see the interceptor sketch after this list.
- Memory: Avoid loading whole files when possible. Production pipelines regularly see images over 100MB. For real robustness, use image.DecodeConfig instead of fully decoding just to get dimensions.
- Rate limiting: Internal ingestion systems can go wild under load. Use gRPC interceptors to enforce request rates and body limits.
- Observability: Add Prometheus metrics for request counts, latencies, and error rates. Use OpenTelemetry for distributed tracing.
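As a starting point for the first and third items, a unary server interceptor can enforce a per-request deadline, and rate-limiting or metrics interceptors slot into the same chain. A sketch, assuming a hypothetical internal/middleware package and an arbitrary five-second budget:
package middleware

import (
	"context"
	"time"

	"google.golang.org/grpc"
)

// WithTimeout enforces a hard per-request deadline so a slow storage fetch
// cannot hold a worker indefinitely.
func WithTimeout(d time.Duration) grpc.UnaryServerInterceptor {
	return func(
		ctx context.Context,
		req any,
		info *grpc.UnaryServerInfo,
		handler grpc.UnaryHandler,
	) (any, error) {
		ctx, cancel := context.WithTimeout(ctx, d)
		defer cancel()
		return handler(ctx, req)
	}
}
In main.go, construct the server with grpc.NewServer(grpc.ChainUnaryInterceptor(middleware.WithTimeout(5 * time.Second))) and every handler inherits the deadline.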
Final thoughts
This design mirrors what production media companies actually deploy: a clean, stateless gRPC microservice that speaks in storage keys, not bytes, and extracts only what downstream systems need. It's scalable, observable, easy to test, easy to extend, and plays well inside larger ingestion workflows.
The key principles are simple: treat storage as the source of truth, make EXIF optional (not mandatory), enforce strict boundaries and timeouts, keep the API strongly typed, and avoid cleverness. Do the obvious, robust thing.
If you're building a real media processing pipeline, this service is an essential building block. And you now have a complete, runnable blueprint for a reliable one.