
With Claris FileMaker 2025, you can now run your own AI Model Server using local infrastructure. Whether you’re integrating text generation, text embedding, or image embedding models, self-hosting gives you full control over performance, cost, and data security—no cloud dependency required.
In this blog, you'll learn best practices for running a self-hosted AI server with FileMaker 2025, including:
- Hardware and GPU recommendations.
- Tips for managing model performance and memory.
- Key pitfalls to avoid when deploying production-ready large language models (LLMs).
- Security and configuration guidance.
Plus, find detailed setup steps, model recommendations, and infrastructure tips in the Claris FileMaker 2025 white paper.
Why run your own AI Model Server?
If you work with sensitive data, need to reduce latency, or want to fine-tune AI behavior, a self-hosted AI environment is ideal. Key benefits include:
- On-premise privacy: Keep sensitive data inside your own infrastructure.
- Performance: Use dedicated GPU hardware for low-latency inference.
- Customization: Choose and fine-tune your own LLMs.
- Cost control: avoid metered cloud pricing.
- Compliance: Align with internal or regulatory requirements.
FileMaker 2025 makes this process easier by integrating AI server management into the Admin Console, but optimal setup still requires deliberate configuration.
Self-hosting best practices for FileMaker AI.
1. Use a dedicated AI server.
Running the AI Model Server on the same machine as FileMaker Server can cause performance bottlenecks.
Do this:
- Use a separate physical or virtual machine.
- Disable the AI Services tab on your main FileMaker Server to avoid accidental use.
- Rightsize your compute resources based on model size. There’s more info on this below.
2. GPU acceleration is required.
The Claris AI Model Server does not support CPU-only model formats such as GGUF.
GPU requirements:
- Windows/Ubuntu: Use NVIDIA GPUs with CUDA enabled (minimum 24 GB VRAM, ideally 32 GB+ for larger models).
- macOS: Use Apple silicon (M1, M2, M3, M4), and choose or convert models to the MLX format.
- RAM guidance: 32 GB is a starting point; 64–96 GB or more is needed for large LLMs.
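To sanity-check an NVIDIA host against the VRAM guidance above, you can parse `nvidia-smi` output. This is a sketch: the 24 GB threshold comes from the recommendation above, and the parsing assumes `nvidia-smi`'s CSV query format with values reported in MiB.

```python
# Check whether each GPU reported by nvidia-smi meets the suggested
# 24 GB VRAM minimum. Assumes the CSV output of
# `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits` (MiB).
import shutil
import subprocess

MIN_VRAM_GB = 24  # minimum from the guidance above; prefer 32 GB+

def parse_vram_gb(csv_output: str) -> list[float]:
    """Parse memory.total values (MiB) from nvidia-smi CSV output into GB."""
    return [int(line.strip()) / 1024
            for line in csv_output.splitlines() if line.strip()]

def gpus_meet_minimum(csv_output: str, minimum_gb: float = MIN_VRAM_GB) -> bool:
    """True only if at least one GPU is present and all meet the minimum."""
    vram = parse_vram_gb(csv_output)
    return bool(vram) and all(v >= minimum_gb for v in vram)

if __name__ == "__main__":
    if shutil.which("nvidia-smi"):
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
        print("OK" if gpus_meet_minimum(out) else "Insufficient VRAM")
    else:
        print("nvidia-smi not found; run this on the GPU host")
```

Run it once on the candidate AI server before installing; on macOS this check does not apply, since Apple silicon uses unified memory instead of dedicated VRAM.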
3. Avoid switching between models at runtime.
The AI Model Server can only load one model per type at a time:
- One text generation model.
- One text embedding model.
- One image embedding model.
If your custom app calls two models of the same type, such as two LLMs, the server will unload and reload each model dynamically, causing significant performance delays.
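One way to avoid accidental swaps is to make the one-model-per-type constraint explicit in your own integration code. The toy registry below illustrates the idea; it is not a Claris API, and the model names are hypothetical.

```python
# Illustrative guard for the constraint described above: the AI Model Server
# keeps at most one loaded model per type, so binding a second model of the
# same type would force a costly unload/reload. This registry fails fast
# instead of silently triggering that swap.
class SingleModelPerType:
    TYPES = {"text-generation", "text-embedding", "image-embedding"}

    def __init__(self):
        self._active: dict[str, str] = {}

    def use(self, model_type: str, model_name: str) -> str:
        if model_type not in self.TYPES:
            raise ValueError(f"unknown model type: {model_type}")
        current = self._active.get(model_type)
        if current is not None and current != model_name:
            # On the real server, this is where the expensive reload happens.
            raise RuntimeError(
                f"{model_type!r} already bound to {current!r}; "
                f"switching to {model_name!r} would force a model reload"
            )
        self._active[model_type] = model_name
        return model_name
```

Standardizing on a single model per type at design time, rather than letting individual scripts pick their own, avoids the reload penalty entirely.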
4. Balance model size with quality.
Quantization reduces model size and memory needs, but it can impact output quality.
Recommendations:
- Use 8-bit quantization for most production use cases.
- Avoid going below 8-bit unless thoroughly tested.
- Know that CUDA-enabled systems support on-the-fly quantization (16-bit to 8-bit), but it's resource-intensive.
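A quick back-of-the-envelope calculation shows why quantization matters for sizing. Weight storage is roughly parameters × bits ÷ 8 bytes; real usage is higher (KV cache, activations, runtime overhead), so treat these figures as a floor, not a budget.

```python
# Approximate LLM weight memory: parameters (billions) * bits / 8 = GB
# (using 1 GB = 1e9 bytes). Excludes KV cache and runtime overhead.
def weight_memory_gb(params_billion: float, bits: int) -> float:
    return params_billion * bits / 8

# An 8B-parameter model:
#   weight_memory_gb(8, 16)  -> 16.0 GB at 16-bit
#   weight_memory_gb(8, 8)   ->  8.0 GB at 8-bit quantization
```

This is why 8-bit quantization roughly halves the VRAM a model needs relative to 16-bit, and why the 24 GB VRAM minimum above fills up quickly with larger models.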
5. Disable token logging in production.
Enabling token usage logging can degrade model inference speed and fill logs unnecessarily.
Turn it off before going live.
Secure your AI infrastructure.
Running a private AI server puts you in control, and it also makes you responsible for security.
Security tips:
- Require API keys for access to your AI endpoints and Retrieval-Augmented Generation (RAG) store.
- Install a valid SSL certificate, not the default self-signed one, to avoid warnings and meet compliance standards such as HIPAA and GDPR.
- Use HTTPS to protect data in transit.
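The tips above can be sketched from the client side. The example below builds an authenticated HTTPS request with Python's standard library; the URL, path, and bearer-token header scheme are placeholders, so check the white paper for the actual endpoint contract your server exposes.

```python
# Sketch: authenticate a request to a self-hosted AI endpoint with an API key
# over HTTPS. Endpoint URL and header scheme are hypothetical placeholders.
import json
import urllib.request

API_KEY = "replace-with-your-key"  # load from a secret store in production
ENDPOINT = "https://ai.example.internal/v1/embeddings"  # hypothetical URL

def build_request(text: str) -> urllib.request.Request:
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",  # key is protected by TLS
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To send: urllib.request.urlopen(build_request("hello")), which will fail
# certificate verification against a self-signed cert, per the tips above.
```

Note that a properly issued certificate lets clients verify the connection without disabling TLS checks, which is exactly the shortcut you want to make impossible.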
What about retrieval-augmented generation (RAG)?
If your solution includes RAG, you’ll need:
- Sufficient memory to load both the model and the RAG vector store.
- API key protection enabled to avoid exposing your private data.
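For the memory point above, a rough sizing formula helps: an in-memory vector store needs about vectors × dimensions × bytes-per-value, on top of the model's own footprint. The figures below are illustrative, assuming float32 embeddings.

```python
# Rough RAG vector-store sizing: n_vectors * dims * bytes per value
# (4 bytes for float32), converted to GB (1 GB = 1e9 bytes).
# Add this to the model's own memory footprint when planning RAM.
def vector_store_gb(n_vectors: int, dims: int, bytes_per_value: int = 4) -> float:
    return n_vectors * dims * bytes_per_value / 1e9

# e.g. 1,000,000 chunks at 1,024 dimensions in float32:
#   vector_store_gb(1_000_000, 1024)  ->  4.096 GB
```

Embedding dimension depends on the embedding model you load, so recheck this estimate whenever you change models.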
You'll find RAG covered in detail in the FileMaker white paper. Download PDF.
AI Model Server FAQs.
Q: Can I run the AI Model Server on the same machine as FileMaker Server?
A: Technically yes, but it's discouraged. A dedicated machine prevents performance degradation and misconfiguration.
Q: Can I use CPU-only models?
A: No, FileMaker 2025's AI Model Server doesn’t support GGUF or CPU-only models.
Q: What’s the easiest way to get started?
A: Use the FileMaker Server 2025 installer on a dedicated Mac with Apple silicon or a Windows/Linux machine with a CUDA-enabled NVIDIA GPU.
Q: Is there a UI for managing models?
A: Yes, the Admin Console in FileMaker 2025 now supports model configuration, selection, and monitoring.
Get the full guide.
Want setup scripts, model recommendations, and hardware benchmarks? Find those details and more in the AI Model Server white paper. Download it now.