_images/logo.png

Motivation

PostgREST is a standalone web server that turns your PostgreSQL database directly into a RESTful API. The structural constraints and permissions in the database determine the API's endpoints and operations.

PostgREST is an alternative to writing CRUD by hand. Typical API servers share a common problem: writing business logic often duplicates effort and drifts out of sync with the data structure. Object-relational mapping is a leaky abstraction that can lead to slow, inefficient code. The PostgREST philosophy establishes a single declarative source of truth: the data itself.

Declarative Programming

It's easier to ask PostgreSQL to join data for you and let its query planner figure out the details than to loop through query results yourself. It's easier to assign permissions to database objects than to add guards in controllers (this is especially true for cascading permissions in data dependencies). And it's easier to set constraints than to write repetitive sanity checks in code.

Leak-Proof Abstraction

There is no ORM involved. Views are created in SQL with well-understood performance implications. A database administrator (DBA) can now create an API on their own, with no application development required.

Embracing the Relational Model

In 1970, E. F. Codd criticized the then-dominant hierarchical model of databases in his paper "A Relational Model of Data for Large Shared Data Banks." Reading the paper reveals a striking similarity between hierarchical databases and nested HTTP routes. In PostgREST we attempt to use flexible filtering and embedding instead of nested routes.

One Thing Well

PostgREST has a focused scope. It works well with tools such as Nginx. This forces a clean separation between the data-centric CRUD operations and other concerns.

Shared Improvements

As with any open source project, we all benefit from the features and fixes contributed to the tool. This is more beneficial than improvements locked away in private codebases.

Ecosystem

PostgREST has a growing ecosystem of examples, libraries, experiments, and users. Here is a selection.

Client Libraries

External Notifications

These are bridges that propagate PostgreSQL LISTEN/NOTIFY events to external queues for further processing. They allow stored procedures to initiate actions outside the database, such as sending emails.

Example Apps

Extensions

Commercial

  • subZero - Automated GraphQL & REST API with built-in caching (powered in part by PostgREST)

Testimonials

"It's so fast to develop, it feels like cheating!"

—François-G. Ribreau

"I just have to say that the CPU/memory usage compared to our API built with the Node.js/Waterline ORM is unbelievable. While serving 1 GB of data across 6 instances (dynos), it used only 60-70 MB."

—Louis Brauer

"I really like the fact that the microservices end up being developed with just SQL DDL (and V8 JavaScript). We completely rewrote a legacy Spring + MySQL application in 6 months. It's 10 times faster and the code is very clean, whereas the previous version took 4 people 3 years."

—Simone Scarduzio

Getting Support

The project has a friendly and growing community. Join our chat room for discussion and help. You can also search the GitHub issues for bugs and feature requests.

Tutorial 0 - Get It Running

Welcome to PostgREST! This preliminary tutorial is a quick start to help you build your first simple API.

PostgREST is a standalone web server that turns your PostgreSQL database into a RESTful API. It serves an API tailored to the structure of the underlying database.

_images/tut0-request-flow.png

To make an API, we simply build a database. All the endpoints and permissions come from database objects such as tables, views, roles, and stored procedures. These tutorials cover many common scenarios and how to model them in the database.

By the end of this tutorial you'll have a working database, a PostgREST server, and a simple single-user todo list API.

Step 1. Relax, we're here to help

As you begin this tutorial, press Ctrl+T to open the project chat room in a new tab. There's a friendly group of people active in the chat room, and we'll help if you get stuck.

Step 2. Install PostgreSQL

You'll need a modern copy of the database running on your machine or in a Docker instance. We require PostgreSQL 9.3 or above, but recommend at least 9.5 for the row-level security features used in later tutorials.

If you're already familiar with PostgreSQL and have it installed locally, you can use your existing database. In this tutorial we'll describe how to use the database inside Docker, because database configuration is otherwise too complicated for a simple tutorial.

If you don't have Docker installed, you can get it here. Once installed, let's pull and start the database image:

sudo docker run --name tutorial -p 5432:5432 \
                -e POSTGRES_PASSWORD=mysecretpassword \
                -d postgres

This runs the Docker instance as a daemon and exposes port 5432 so you can access the PostgreSQL server.

Step 3. Install PostgREST

PostgREST is distributed as a single binary, with prebuilt versions for major Linux/BSD/Windows distributions. Visit the latest release page for a list of downloads. If your platform is not among those already prebuilt, see Building from Source for instructions on building it yourself. You can also let us know so we can add your platform to the next release.

The downloads are .tar.xz compressed archives (except a zip file for Windows). To extract the binary, open a terminal and run

# download from https://github.com/begriffs/postgrest/releases/latest

tar xfJ postgrest-<version>-<platform>.tar.xz

You'll get a file named postgrest (or postgrest.exe on Windows). At this point, try running it with

./postgrest

If everything works, it will print its version and information about the configuration. You can continue to run it from the directory you downloaded it to, or copy it to a system directory such as /usr/local/bin on Linux so you can run it from any directory.

Note

PostgREST requires libpq, the PostgreSQL C library, to be installed. Without it you'll get an error like "error while loading shared libraries: libpq.so.5." Here's how to fix it:

Ubuntu or Debian
sudo apt-get install libpq-dev
Fedora, CentOS, or Red Hat
sudo yum install postgresql-libs
OS X
brew install postgresql
Windows

It isn't fun. Learn more here.

It might be easier to execute PostgREST in its own Docker image as well.

Step 4. Create a database for the API

To connect to the SQL console (psql) inside the container, run:

sudo docker exec -it tutorial psql -U postgres

You should see the psql command prompt:

psql (9.6.3)
Type "help" for help.

postgres=#

The first thing we'll do is create a named schema for the database objects which will be exposed in the API. We can choose any name we like; how about "api"? Execute this in the prompt you just opened:

create schema api;

Our API will have one endpoint, /todos, backed by a table:

create table api.todos (
  id serial primary key,
  done boolean not null default false,
  task text not null,
  due timestamptz
);

insert into api.todos (task) values
  ('finish tutorial 0'), ('pat self on back');

Next, create a role to use for anonymous web requests. When a request comes in, PostgREST will switch to this role in the database to run its queries.

create role web_anon nologin;
grant web_anon to postgres;

grant usage on schema api to web_anon;
grant select on api.todos to web_anon;

The web_anon role has permission to access the api schema and to read rows in the todos table.

You can quit psql now; it's time to start using the API!

\q

Step 5. Run PostgREST

PostgREST uses a configuration file to tell it how to connect to the database. Create a file tutorial.conf with this content:

db-uri = "postgres://postgres:mysecretpassword@localhost/postgres"
db-schema = "api"
db-anon-role = "web_anon"

The configuration options are covered in detail in the Configuration section. Now run the server:

./postgrest tutorial.conf

You should see

Listening on port 3000
Attempting to connect to the database...
Connection successful

It's time to make web requests. There are many graphical API exploration tools available, but in this tutorial we'll use curl. Open a new terminal (leaving the one with PostgREST running). Try an HTTP request for the todos.

curl http://localhost:3000/todos

The API returns:

[
  {
    "id": 1,
    "done": false,
    "task": "finish tutorial 0",
    "due": null
  },
  {
    "id": 2,
    "done": false,
    "task": "pat self on back",
    "due": null
  }
]

With the current role permissions, anonymous requests have read-only access to the todos table. If we try to add a new todo, it is refused.

curl http://localhost:3000/todos -X POST \
     -H "Content-Type: application/json" \
     -d '{"task": "do bad thing"}'

The response is 401 Unauthorized:

{
  "hint": null,
  "details": null,
  "code": "42501",
  "message": "permission denied for relation todos"
}

There we have it, a basic API on top of the database! In the next tutorials we'll see how to extend this example with more sophisticated user access controls, more tables, and queries.

Tutorial 1 - The Golden Key

In Tutorial 0 - Get It Running we created a read-only API for the todos data, with a single endpoint to list the todos. There are many directions we could take to make this API more interesting, but a good start would be to allow some users to change data rather than just query it.

Step 1. Add a trusted user

The previous tutorial created a web_anon role in the database for running anonymous web requests. Let's create another role, todo_user, for users who authenticate with the API. This role will have the authority to do anything with the todo list.

-- run this in psql using the database created
-- in the previous tutorial

create role todo_user nologin;
grant todo_user to postgres;

grant usage on schema api to todo_user;
grant all on api.todos to todo_user;
grant usage, select on sequence api.todos_id_seq to todo_user;

Step 2. Make a secret

Clients authenticate with the API using JSON Web Tokens. A JWT is a JSON object that is cryptographically signed with a password known only to us and the server. Because clients do not know the password, they cannot tamper with the contents of a token. PostgREST will detect forged tokens and reject them.

Let's create a password and provide it to PostgREST. Think of a good long one, or use a tool to generate it.

Note

The OpenSSL toolkit provides an easy way to generate a secure password. If you have it installed, run

openssl rand -base64 32

Open tutorial.conf (created in the previous tutorial) and add the password on a new line:

# add this line to tutorial.conf

jwt-secret = "<the password you created>"

If the PostgREST server is still running, restart it to load the updated configuration file.

Step 3. Make a token

Ordinarily your own code in the database or in another server will create and sign authentication tokens, but in this tutorial we'll make one "by hand." Head over to jwt.io and fill in the fields like this:

jwt.io interface

How to create a token at https://jwt.io

Remember to fill in the password you generated rather than the one shown in the picture. After you fill in the password and the payload, the encoded data on the left updates; that data is the token. Copy it.
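
If you'd rather script this step than use the jwt.io form, an HS256 token can be produced with nothing but the standard library. This is only a sketch, not part of PostgREST's tooling; the secret string below is a placeholder for the one you put in tutorial.conf:

```python
import base64, hashlib, hmac, json

def b64url(data: bytes) -> str:
    # JWT uses base64url encoding with the padding stripped
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_jwt(payload: dict, secret: str) -> str:
    # signing input is base64url(header) "." base64url(payload)
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    # HS256 = HMAC-SHA256 over the signing input
    signature = hmac.new(secret.encode(), signing_input.encode(),
                         hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)

# placeholder secret -- use the one from tutorial.conf
token = make_jwt({"role": "todo_user"}, "<the password you created>")
print(token)
```

Libraries such as PyJWT do the same thing with more validation; the point here is only that a token is a signed, encoded payload.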

Note

While the token may look obscure, its payload is easy to reverse-engineer. The token is merely signed, not encrypted, so don't put anything inside it that you don't want a client to see.
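
You can see this for yourself by decoding a payload without knowing the secret. A minimal sketch (the token here is assembled inline for illustration, with a fake signature):

```python
import base64, json

def jwt_payload(token: str) -> dict:
    # the middle segment is just base64url-encoded JSON
    part = token.split(".")[1]
    part += "=" * (-len(part) % 4)  # restore the stripped padding
    return json.loads(base64.urlsafe_b64decode(part))

# an illustrative token: header.payload.signature
header = base64.urlsafe_b64encode(b'{"alg":"HS256","typ":"JWT"}').rstrip(b"=").decode()
payload = base64.urlsafe_b64encode(b'{"role":"todo_user"}').rstrip(b"=").decode()
token = header + "." + payload + ".fakesignature"

print(jwt_payload(token))  # {'role': 'todo_user'}
```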

Step 4. Make a request

Back in the terminal, let's use curl to add a todo. The request will include an HTTP header containing the authentication token.

export TOKEN="<paste token here>"

curl http://localhost:3000/todos -X POST \
     -H "Authorization: Bearer $TOKEN"   \
     -H "Content-Type: application/json" \
     -d '{"task": "learn how to auth"}'

Now that we've finished all three items on our todo list, let's set them all to done with a PATCH request:

curl http://localhost:3000/todos -X PATCH \
     -H "Authorization: Bearer $TOKEN"    \
     -H "Content-Type: application/json"  \
     -d '{"done": true}'

Request the todos again and you'll see all three items, now all done:

curl http://localhost:3000/todos
[
  {
    "id": 1,
    "done": true,
    "task": "finish tutorial 0",
    "due": null
  },
  {
    "id": 2,
    "done": true,
    "task": "pat self on back",
    "due": null
  },
  {
    "id": 3,
    "done": true,
    "task": "learn how to auth",
    "due": null
  }
]

Step 5. Add expiration

So far our authentication token remains valid for all requests indefinitely. The server will continue to honor it as long as it keeps using the same JWT password.

A better policy is to include an expiration timestamp in the token using the exp claim. This is one of two JWT claims that PostgREST treats specially.

Claim Interpretation
role The database role under which to execute SQL for API request
exp Expiration timestamp for token, expressed in "Unix epoch time"

Note

Unix epoch time is defined as the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC) on 1 January 1970, not counting leap seconds.

To observe expiration in action, we'll add an exp claim five minutes in the future. First find the epoch value of five minutes from now. In psql run:

select extract(epoch from now() + '5 minutes'::interval) :: integer;
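
The same value can be computed outside the database; for instance, this Python sketch is equivalent to the psql query above:

```python
import time

# Unix epoch timestamp five minutes from now,
# suitable for the JWT "exp" claim
exp = int(time.time()) + 5 * 60
print(exp)
```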

Back at jwt.io, change the payload to

{
  "role": "todo_user",
  "exp": <computed epoch value>
}

Copy the new token and save it as a new environment variable.

export NEW_TOKEN="<paste new token>"

Try the request with curl before and after the expiration time:

curl http://localhost:3000/todos \
     -H "Authorization: Bearer $NEW_TOKEN"

After expiration, the API returns HTTP 401 Unauthorized:

{"message":"JWT expired"}

Bonus: Immediate Revocation

Even with token expiration there are times when you may want to immediately revoke access for a specific token. For instance, suppose you learn that a disgruntled employee is up to no good and his token is still valid.

To revoke a specific token we need a way to tell it apart from others. Let's add a custom email claim matching the email of the client the token was issued to.

Go ahead and make a new token with the payload

{
  "role": "todo_user",
  "email": "disgruntled@mycompany.com"
}

Save it to an environment variable:

export WAYWARD_TOKEN="<paste new token>"

PostgREST allows us to specify a stored procedure to run during attempted authentication. The function can do whatever it likes, including raising an exception to terminate the request.

First make a new schema and add the function:

create schema auth;
grant usage on schema auth to web_anon, todo_user;

create or replace function auth.check_token() returns void
  language plpgsql
  as $$
begin
  if current_setting('request.jwt.claim.email', true) =
     'disgruntled@mycompany.com' then
    raise insufficient_privilege
      using hint = 'Nope, we are on to you';
  end if;
end
$$;

Next update tutorial.conf and specify the new function:

# add this line to tutorial.conf

pre-request = "auth.check_token"

Restart PostgREST for the change to take effect. Next try making a request with our original token and then with the revoked one.

# this request still works

curl http://localhost:3000/todos \
     -H "Authorization: Bearer $TOKEN"

# this one is rejected

curl http://localhost:3000/todos \
     -H "Authorization: Bearer $WAYWARD_TOKEN"

The server responds with 403 Forbidden:

{
  "hint": "Nope, we are on to you",
  "details": null,
  "code": "42501",
  "message": "insufficient_privilege"
}

Binary Release

[ Download page ]

The download page has pre-compiled binaries for Mac OS X, Windows, and several Linux distributions. After extracting the archive, run the binary with the --help flag to see usage instructions:

# extract the tarball (available at https://github.com/begriffs/postgrest/releases/latest)

$ tar Jxf postgrest-[version]-[platform].tar.xz

# try running it
$ ./postgrest --help

# You should see a usage help message

Homebrew

On a Mac you can install PostgREST using Homebrew:

# ensure brew is up to date
brew update

# check for any problems with brew's setup
brew doctor

# install postgrest
brew install postgrest

The command above automatically installs PostgreSQL as a dependency. The process can take up to 15 minutes to install the package and its dependencies.

After installation completes, the tool is added to your $PATH and can be used from anywhere:

postgrest --help

PostgreSQL Dependency

To use PostgREST you will need an underlying database (PostgreSQL 9.3 or greater). You can use something like Amazon RDS, but installing locally is cheaper and more convenient for development.

Building from Source

Note

We don't recommend building or using PostgREST on Alpine Linux because of a reported GHC memory leak on that platform.

When a prebuilt executable does not exist for your system, you can build the project from source. You'll also need to do this if you want to help with development. Stack will install any necessary Haskell dependencies on your system.

  • Install Stack

  • Install library dependencies

    Operating System Dependencies
    Ubuntu/Debian libpq-dev, libgmp-dev
    CentOS/Fedora/Red Hat postgresql-devel, zlib-devel, gmp-devel
    BSD postgresql95-server
    OS X postgresql, gmp
  • Build and install the binary

    git clone https://github.com/begriffs/postgrest.git
    cd postgrest
    
    # adjust local-bin-path to taste
    stack build --install-ghc --copy-bins --local-bin-path /usr/local/bin
    

Note

If building fails and your system has less than 1 GB of memory, try adding a swap file.

  • Check that the installation was successful: postgrest --help.

PostgREST Test Suite

Creating the Test Database

To properly run the PostgREST tests, you first need to create a test database. To do so, use the database creation script create_test_db in the test/ directory.

The script takes the following parameters:

test/create_test_db connection_uri database_name [test_db_user] [test_db_user_password]

Use the connection URI to specify the database user, password, host, and port. Do not provide a database in the connection URI. The Postgres role you are connecting with must be capable of creating new databases.

The database_name parameter is the name of the test database to connect to. If a database with that name already exists on the server, the script will drop it and re-create it.

If a database user is specified for running the tests, that user will be granted the permissions needed to reset the database after every test run.

If no user is specified, the script will generate a role named postgrest_test_ suffixed with the chosen database name, along with a random password.

If an existing user is specified for the test connection, that user's password must also be provided.

The script prints the database URI to be used by the tests; this URI corresponds to the db-uri configuration parameter you would use in production.

Generating a user and password allows the test database to be created and the tests run against any Postgres server without any modifications to the server (such as allowing accounts without passwords, setting up trust authentication, or requiring the server to be on the same localhost the tests are run on).

Running the Tests

To run the tests, the database URI must be provided in the POSTGREST_TEST_CONNECTION environment variable.

Typically, the database is created and the tests run in a single command, using the postgres superuser:

POSTGREST_TEST_CONNECTION=$(test/create_test_db "postgres://postgres:pwd@database-host" test_db) stack test

For repeated test runs against the same database, export the connection variable:

export POSTGREST_TEST_CONNECTION=$(test/create_test_db "postgres://postgres:pwd@database-host" test_db)
stack test
stack test
...

If the environment variable is empty or unspecified, the test runner will default to the connection URI

postgres://postgrest_test@localhost/postgrest_test

This connection assumes a test server on localhost, with a database user postgrest_test that has no password, and a database of the same name.

Destroying the Database

The test database will remain after the tests finish, and four new roles will have been created on the Postgres server. To permanently erase the created database and roles, run the test/destroy_test_db script with the same superuser role used when creating the database:

test/destroy_test_db connection_uri database_name

Testing with Docker

To simplify the setup of a test environment when you don't want to test against a local PostgreSQL, there is a convenient way to run PostgreSQL in Docker.

For example, if you are developing locally on a Mac (with Docker installed), you can run:

$ docker run --name db-scripting-test -e POSTGRES_PASSWORD=pwd -p 5434:5432 -d postgres
$ POSTGREST_TEST_CONNECTION=$(test/create_test_db "postgres://postgres:pwd@localhost:5434" test_db) stack test

In addition, if you run stack itself inside a Docker container (necessary on macOS Sierra with GHC below 8.0.1, where stack test fails), you can run PostgreSQL in a separate container, or use the locally installed Postgres.app.

Build the test container from test/Dockerfile.test:

$ docker build -t pgst-test - < test/Dockerfile.test
$ mkdir .stack-work-docker ~/.stack-linux

The first time the test container runs it will take a long time, while the dependencies get cached. Creating the ~/.stack-linux folder and mounting it into the container ensures that we can run the container in disposable one-off mode without worrying about subsequent runs being slow. .stack-work-docker is likewise mapped into the container and must be specified when using stack from Linux, so as not to interfere with the .stack-work from your local development. (On Sierra, stack build works, while stack test does not with GHC 8.0.1.)

Run the tests with the folders mapped into the Docker container:

$ docker run --name pg -e POSTGRES_PASSWORD=pwd  -d postgres
$ docker run --rm -it -v `pwd`:`pwd` -v ~/.stack-linux:/root/.stack --link pg:pg -w="`pwd`" -v `pwd`/.stack-work-docker:`pwd`/.stack-work pgst-test bash -c "POSTGREST_TEST_CONNECTION=$(test/create_test_db "postgres://postgres:pwd@pg" test_db) stack test"

On the Mac, run the stack tests in Docker like this:

$ host_ip=$(ifconfig en0 | grep 'inet ' | cut -f 2 -d' ')
$ export POSTGREST_TEST_CONNECTION=$(test/create_test_db "postgres://postgres@$HOST" test_db)
$ docker run --rm -it -v `pwd`:`pwd` -v ~/.stack-linux:/root/.stack -v `pwd`/.stack-work-docker:`pwd`/.stack-work -e "HOST=$host_ip" -e "POSTGREST_TEST_CONNECTION=$POSTGREST_TEST_CONNECTION" -w="`pwd`" pgst-test bash -c "stack test"
$ test/destroy_test_db "postgres://postgres@localhost" test_db

Configuration

The PostgREST server reads a configuration file to determine information about the database and how to serve client requests. There is no predefined location for this file, you must specify the file path as the one and only argument to the server:

postgrest /path/to/postgrest.conf

The file must contain a set of key value pairs. At minimum you must include these keys:

# postgrest.conf

# The standard connection URI format, documented at
# https://www.postgresql.org/docs/current/static/libpq-connect.html#AEN45347
db-uri       = "postgres://user:pass@host:5432/dbname"

# The name of which database schema to expose to REST clients
db-schema    = "api"

# The database role to use when no client authentication is provided.
# Can (and probably should) differ from user in db-uri
db-anon-role = "anon"

The user specified in the db-uri is also known as the authenticator role. For more information about the anonymous vs authenticator roles see the Overview of Role System.

Here is the full list of configuration parameters.

Name Type Default Required
db-uri String   Y
db-schema String   Y
db-anon-role String   Y
db-pool Int 10  
server-host String *4  
server-port Int 3000  
server-proxy-uri String    
jwt-secret String    
secret-is-base64 Bool False  
max-rows Int  
pre-request String    
db-uri
The standard connection PostgreSQL URI format. Symbols and unusual characters in the password or other fields should be percent encoded to avoid a parse error. On older systems like Centos 6, with older versions of libpq, a different db-uri syntax has to be used. In this case the URI is a string of space separated key-value pairs (key=value), so the example above would be "host=host user=user port=5432 dbname=dbname password=pass". Also allows connections over Unix sockets for higher performance.
db-schema
The database schema to expose to REST clients. Tables, views and stored procedures in this schema will get API endpoints.
db-anon-role
The database role to use when executing commands on behalf of unauthenticated clients.
db-pool
Number of connections to keep open in PostgREST's database pool. Having enough here for the maximum expected simultaneous client connections can improve performance. Note it's pointless to set this higher than the max_connections GUC in your database.
server-host
Where to bind the PostgREST web server.
server-port
The port to bind the web server.
server-proxy-uri
Overrides the base URL used within the OpenAPI self-documentation hosted at the API root path. Use a complete URI syntax scheme:[//[user:password@]host[:port]][/]path[?query][#fragment]. Ex. https://postgrest.com
{
  "swagger": "2.0",
  "info": {
    "version": "0.4.0.0",
    "title": "PostgREST API",
    "description": "This is a dynamic API generated by PostgREST"
  },
  "host": "postgrest.com:443",
  "basePath": "/",
  "schemes": [
    "https"
  ]
}
jwt-secret
The secret used to decode JWT tokens clients provide for authentication. If this parameter is not specified then PostgREST refuses authentication requests. Choosing a value for this parameter beginning with the at sign such as @filename loads the secret out of an external file. This is useful for automating deployments. Note that any binary secrets must be base64 encoded.
secret-is-base64
When this is set to true, the value derived from jwt-secret will be treated as a base64 encoded secret.
max-rows
A hard limit to the number of rows PostgREST will fetch from a view, table, or stored procedure. Limits payload size for accidental or malicious requests.
pre-request
A schema-qualified stored procedure name to call right after switching roles for a client request. This provides an opportunity to modify SQL variables or raise an exception to prevent the request from completing.
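
The percent-encoding requirement for db-uri above can be handled with a standard library call; for example, in Python (the credentials here are made up):

```python
from urllib.parse import quote

# a password containing characters that would break URI parsing
password = "pa#ss/w@rd"

# percent encode it before interpolating into db-uri
db_uri = "postgres://app_user:%s@localhost:5432/appdb" % quote(password, safe="")
print(db_uri)
# postgres://app_user:pa%23ss%2Fw%40rd@localhost:5432/appdb
```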

Running the Server

PostgREST outputs basic request logging to stdout. When running it in an SSH session you must detach it from stdout or it will be terminated when the session closes. The easiest technique is redirecting the output to a logfile or to the syslog:

ssh foo@example.com \
  'postgrest foo.conf </dev/null >/var/log/postgrest.log 2>&1 &'

# another option is to pipe the output into "logger -t postgrest"

(Avoid nohup postgrest because the HUP signal is used for manual Schema Reloading.)

Hardening PostgREST

PostgREST is a fast way to construct a RESTful API. Its default behavior is great for scaffolding in development. When it's time to go to production it works great too, as long as you take precautions. PostgREST is a small sharp tool that focuses on performing the API-to-database mapping. We rely on a reverse proxy like Nginx for additional safeguards.

The first step is to create an Nginx configuration file that proxies requests to an underlying PostgREST server.

http {
  ...
  # upstream configuration
  upstream postgrest {
    server localhost:3000;
    keepalive 64;
  }
  ...
  server {
    ...
    # expose to the outside world
    location /api/ {
      default_type  application/json;
      proxy_hide_header Content-Location;
      add_header Content-Location  /api/$upstream_http_content_location;
      proxy_set_header  Connection "";
      proxy_http_version 1.1;
      proxy_pass http://postgrest/;
    }
    ...
  }
}

Blocking Full-Table Operations

Each table in the admin-selected schema gets exposed as a top level route. Client requests are executed by certain database roles depending on their authentication. All HTTP verbs are supported that correspond to actions permitted to the role. For instance if the active role can drop rows of the table then the DELETE verb is allowed for clients. Here's an API request to delete old rows from a hypothetical logs table:

DELETE /logs?time=lt.1991-08-06 HTTP/1.1

However it's very easy to delete the entire table by omitting the query parameter!

DELETE /logs HTTP/1.1

This can happen accidentally such as by switching a request from a GET to a DELETE. To protect against accidental operations use the pg-safeupdate PostgreSQL extension. It raises an error if UPDATE or DELETE are executed without specifying conditions. To install it you can use the PGXN network:

sudo -E pgxn install safeupdate

# then add this to postgresql.conf:
# shared_preload_libraries='safeupdate';

This does not protect against malicious actions, since someone can add a url parameter that does not affect the result set. To prevent this you must turn to database permissions, forbidding the wrong people from deleting rows, and using row-level security if finer access control is required.

Count-Header DoS

For the convenience of client-side pagination controls, PostgREST supports counting and reporting the total table size in its response. As described in Limit and Pagination, responses ordinarily include a range but leave the total unspecified like

HTTP/1.1 200 OK
Range-Unit: items
Content-Range: 0-14/*

However including the request header Prefer: count=exact calculates and includes the full count:

HTTP/1.1 206 Partial Content
Range-Unit: items
Content-Range: 0-14/3573458

This is fine in small tables, but count performance degrades in big tables due to the MVCC architecture of PostgreSQL. For very large tables it can take a very long time to retrieve the results which allows a denial of service attack. The solution is to strip this header from all requests:

In Nginx, remove any Prefer header which contains the word "count" before proxying requests to PostgREST.
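
One way to strip the header in Nginx (a sketch; the /api/ location and postgrest upstream are the ones assumed elsewhere in this guide, and the map block belongs in the http context):

```nginx
# blank out the Prefer header whenever it mentions "count"
map $http_prefer $pass_prefer {
  default   $http_prefer;  # pass other preferences through
  ~*count   "";            # drop the header for count requests
}

server {
  location /api/ {
    proxy_set_header Prefer $pass_prefer;
    proxy_pass http://postgrest/;
  }
}
```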

Note

In future versions we will support Prefer: count=estimated to leverage the PostgreSQL statistics tables for a fast (and fairly accurate) result.

HTTPS

See the ssl section of the authentication guide.

Rate Limiting

Nginx supports "leaky bucket" rate limiting (see official docs). Using standard Nginx configuration, routes can be grouped into request zones for rate limiting. For instance we can define a zone for login attempts:

limit_req_zone $binary_remote_addr zone=login:10m rate=1r/s;

This creates a shared memory zone called "login" to store a log of IP addresses that access the rate-limited URLs. The space reserved, 10 MB (10m), gives us enough space to store a history of 160k requests. We have chosen to allow only one request per second (1r/s).

Next we apply the zone to certain routes, like a hypothetical stored procedure called login.

location /rpc/login/ {
  # apply rate limiting
  limit_req zone=login burst=5;
}

The burst argument tells Nginx to start dropping requests if more than five queue up from a specific IP.

Nginx rate limiting is general and indiscriminate. To rate limit each authenticated request individually you will need to add logic in a Custom Validation function.

Debugging

The PostgREST server logs basic request information to stdout, including the requesting IP address and user agent, the URL requested, and HTTP response status. However this provides limited information for debugging server errors. It's helpful to get full information about both client requests and the corresponding SQL commands executed against the underlying database.

A great way to inspect incoming HTTP requests including headers and query params is to sniff the network traffic on the port where PostgREST is running. For instance on a development server bound to port 3000 on localhost, run this:

# sudo access is necessary for watching the network
sudo ngrep -d lo0 port 3000

The options to ngrep vary depending on the address and host on which you've bound the server. The binding is described in the Configuration section. The ngrep output isn't particularly pretty, but it's legible. Note the Server response header as well which identifies the version of server. This is important when submitting bug reports.

Once you've verified that requests are as you expect, you can get more information about the server operations by watching the database logs. By default PostgreSQL does not keep these logs, so you'll need to make the configuration changes below. Find postgresql.conf inside your PostgreSQL data directory (to find that, issue the command show data_directory;). Either find the settings scattered throughout the file and change them to the following values, or append this block of code to the end of the configuration file.

# send logs where the collector can access them
log_destination = "stderr"

# collect stderr output to log files
logging_collector = on

# save logs in pg_log/ under the pg data directory
log_directory = "pg_log"

# (optional) new log file per day
log_filename = "postgresql-%Y-%m-%d.log"

# log every kind of SQL statement
log_statement = "all"

Restart the database and watch the log file in real-time to understand how HTTP requests are being translated into SQL commands.

Schema Reloading

Users are often confused by PostgREST's database schema cache. It is present because detecting foreign key relationships between tables (including how those relationships pass through views) is necessary, but costly. API requests consult the schema cache as part of Resource Embedding. However if the schema changes while the server is running it results in a stale cache and leads to errors claiming that no relations are detected between tables.

To refresh the cache without restarting the PostgREST server, send the server process a SIGHUP signal:

killall -HUP postgrest

In the future we're investigating ways to keep the cache updated without manual intervention.

Alternate URL Structure

As discussed in Singular or Plural, there are no special URL forms for singular resources in PostgREST, only operators for filtering. Thus there are no URLs like /people/1. It would be specified instead as

GET /people?id=eq.1
Accept: application/vnd.pgrst.object+json

This allows compound primary keys and makes the intent for singular response independent of a URL convention.

Nginx rewrite rules allow you to simulate the familiar URL convention. The following example adds a rewrite rule for all table endpoints, but you'll want to restrict it to those tables that have a numeric simple primary key named "id."

# support /endpoint/:id url style
location ~ ^/([a-z_]+)/([0-9]+) {

  # make the response singular
  proxy_set_header Accept 'application/vnd.pgrst.object+json';

  # assuming an upstream named "postgrest"
  proxy_pass http://postgrest/$1?id=eq.$2;

}

Tables & Views

All views and tables in the active schema and accessible by the active database role for a request are available for querying. They are exposed in one-level deep routes. For instance the full contents of a table people is returned at

GET /people HTTP/1.1

There are no deeply/nested/routes. Each route provides OPTIONS, GET, POST, PATCH, and DELETE verbs depending entirely on database permissions.

Note

Why not provide nested routes? Many APIs allow nesting to retrieve related information, such as /films/1/director. We offer a more flexible mechanism (inspired by GraphQL) to embed related information. It can handle one-to-many and many-to-many relationships. This is covered in the section about Resource Embedding.

Horizontal Filtering (Rows)

You can filter result rows by adding conditions on columns, each condition a query string parameter. For instance, to return people aged under 13 years old:

GET /people?age=lt.13 HTTP/1.1

Adding multiple parameters conjoins the conditions:

GET /people?age=gte.18&student=is.true HTTP/1.1

These operators are available:

abbreviation meaning
eq equals
gte greater than or equal
gt greater than
lte less than or equal
lt less than
neq not equal
like LIKE operator (use * in place of %)
ilike ILIKE operator (use * in place of %)
in one of a list of values e.g. ?a=in.1,2,3 – also supports commas in quoted strings like ?a=in."hi,there","yes,you"
is checking for exact equality (null,true,false)
@@ full-text search using to_tsquery
@> contains e.g. ?tags=@>.{example, new}
<@ contained in e.g. ?values=<@{1,2,3}
not negates another operator, see below

To negate any operator, prefix it with not like ?a=not.eq.2.
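
Because filters are ordinary query-string parameters, any HTTP client can compose them. A sketch in Python, reusing the operators shown above (the endpoint matches the examples in this section):

```python
from urllib.parse import urlencode

# column -> "operator.value" pairs; PostgREST combines them with AND
filters = {"age": "gte.18", "student": "is.true", "id": "not.eq.2"}
url = "http://localhost:3000/people?" + urlencode(filters)
print(url)
# http://localhost:3000/people?age=gte.18&student=is.true&id=not.eq.2
```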

For more complicated filters (such as those involving disjunctions) you will have to create a new view in the database, or use a stored procedure. For instance, here's a view to show "today's stories" including possibly older pinned stories:

CREATE VIEW fresh_stories AS
SELECT *
  FROM stories
 WHERE pinned = true
    OR published > now() - interval '1 day'
ORDER BY pinned DESC, published DESC;

The view will provide a new endpoint:

GET /fresh_stories HTTP/1.1

Note

We're working to extend the PostgREST query grammar to allow more complicated boolean logic, while continuing to prevent performance problems from arbitrary client queries.

Vertical Filtering (Columns)

When certain columns are wide (such as those holding binary data), it is more efficient for the server to withhold them in a response. The client can specify which columns are required using the select parameter.

GET /people?select=fname,age HTTP/1.1

The default is *, meaning all columns. This value will become more important below in Resource Embedding.

Computed Columns

Filters may be applied to computed columns as well as actual table/view columns, even though the computed columns will not appear in the output. For example, to search first and last names at once we can create a computed column that will not appear in the output but can be used in a filter:

CREATE TABLE people (
  fname text,
  lname text
);

CREATE FUNCTION full_name(people) RETURNS text AS $$
  SELECT $1.fname || ' ' || $1.lname;
$$ LANGUAGE SQL;

-- (optional) add an index to speed up anticipated query
CREATE INDEX people_full_name_idx ON people
  USING GIN (to_tsvector('english', full_name(people)));

A full-text search on the computed column:

GET /people?full_name=@@.Beckett HTTP/1.1

As mentioned, computed columns do not appear in the output by default. However you can include them by listing them in the vertical filtering select param:

GET /people?select=*,full_name HTTP/1.1

Ordering

The reserved word order reorders the response rows. It uses a comma-separated list of columns and directions:

GET /people?order=age.desc,height.asc HTTP/1.1

If no direction is specified it defaults to ascending order:

GET /people?order=age HTTP/1.1

If you care where nulls are sorted, add nullsfirst or nullslast:

GET /people?order=age.nullsfirst HTTP/1.1
GET /people?order=age.desc.nullslast HTTP/1.1

You can also use Computed Columns to order the results, even though the computed columns will not appear in the output.

Limit and Pagination

PostgREST uses HTTP range headers to describe the size of results. Every response contains the current range and, if requested, the total number of results:

HTTP/1.1 200 OK
Range-Unit: items
Content-Range: 0-14/*

Here items zero through fourteen are returned. This information is available in every response and can help you render pagination controls on the client. This is an RFC7233-compliant solution that keeps the response JSON cleaner.
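
Client code can read pagination state out of these headers. A sketch of parsing the Content-Range value in Python (ignoring the */0 form returned for empty results):

```python
import re

def parse_content_range(value: str):
    """Parse '0-14/*' or '0-24/3573458' into (first, last, total)."""
    m = re.fullmatch(r"(\d+)-(\d+)/(\*|\d+)", value)
    if m is None:
        raise ValueError("unrecognized Content-Range: %r" % value)
    first, last, total = m.groups()
    # total is None when the server reports it as unspecified ("*")
    return int(first), int(last), (None if total == "*" else int(total))

print(parse_content_range("0-14/*"))        # (0, 14, None)
print(parse_content_range("0-24/3573458"))  # (0, 24, 3573458)
```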

There are two ways to apply a limit and offset rows: through request headers or query params. When using headers you specify the range of rows desired. This request gets the first twenty people.

GET /people HTTP/1.1
Range-Unit: items
Range: 0-19

Note that the server may respond with fewer if unable to meet your request:

HTTP/1.1 200 OK
Range-Unit: items
Content-Range: 0-17/*

You may also request open-ended ranges for an offset with no limit, e.g. Range: 10-.

The other way to request a limit or offset is with query parameters. For example

GET /people?limit=15&offset=30 HTTP/1.1

This method is also useful for embedded resources, which we will cover in another section. The server always responds with range headers even if you use query parameters to limit the query.

In order to obtain the total size of the table or view (such as when rendering the last page link in a pagination control), specify your preference in a request header:

GET /bigtable HTTP/1.1
Range-Unit: items
Range: 0-24
Prefer: count=exact

Note that the larger the table the slower this query runs in the database. The server will respond with the selected range and total:

HTTP/1.1 206 Partial Content
Range-Unit: items
Content-Range: 0-24/3573458
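On the client side the Content-Range value can be unpacked to drive pagination controls. A minimal sketch, assuming the header follows the items form shown above:

```python
def parse_content_range(value):
    """Split '0-24/3573458' or '0-14/*' into (first, last, total).

    total is None when the server reports '*' (exact count not requested).
    """
    rng, _, total = value.partition("/")
    first, _, last = rng.partition("-")
    return int(first), int(last), (None if total == "*" else int(total))

print(parse_content_range("0-24/3573458"))
print(parse_content_range("0-14/*"))
```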

Response Formats

PostgREST uses proper HTTP content negotiation (RFC7231) to deliver the desired representation of a resource. That is to say the same API endpoint can respond in different formats like JSON or CSV depending on the client request.

Use the Accept request header to specify the acceptable format (or formats) for the response:

GET /people HTTP/1.1
Accept: application/json

The current possibilities are

  • */*
  • text/csv
  • application/json
  • application/openapi+json
  • application/octet-stream

The server will default to JSON for API endpoints and OpenAPI on the root.

Singular or Plural

By default PostgREST returns all JSON results in an array, even when there is only one item. For example, requesting /items?id=eq.1 returns

[
  { "id": 1 }
]

This can be inconvenient for client code. To return the first result as an object unenclosed by an array, specify vnd.pgrst.object as part of the Accept header

GET /items?id=eq.1 HTTP/1.1
Accept: application/vnd.pgrst.object+json

This returns

{ "id": 1 }

When a singular response is requested but no entries are found, the server responds with an empty body and 404 status code rather than the usual empty array and 200 status.

Note

Many APIs distinguish plural and singular resources using a special nested URL convention e.g. /stories vs /stories/1. Why do we use /stories?id=eq.1? The answer is because a singular resource is (for us) a row determined by a primary key, and primary keys can be compound (meaning defined across more than one column). The more familiar nested URLs consider only a degenerate case of simple and overwhelmingly numeric primary keys. These so-called artificial keys are often introduced automatically by Object Relational Mapping libraries.

Admittedly PostgREST could detect when there is an equality condition holding on all columns constituting the primary key and automatically convert to singular. However this could lead to a surprising change of format that breaks unwary client code just by filtering on an extra column. Instead we allow manually specifying singular vs plural to decouple that choice from the URL format.

Binary Output

If you want to return raw binary data from a bytea column, you must specify application/octet-stream as part of the Accept header and select a single column ?select=bin_data.

GET /items?select=bin_data&id=eq.1 HTTP/1.1
Accept: application/octet-stream

Note

If more than one row would be returned the binary results will be concatenated with no delimiter.

Resource Embedding

In addition to providing RESTful routes for each table and view, PostgREST allows related resources to be included together in a single API call. This reduces the need for multiple API requests. The server uses foreign keys to determine which tables and views can be returned together. For example, consider a database of films and their awards:

_images/film.png

As seen above in Vertical Filtering (Columns) we can request the titles of all films like this:

GET /films?select=title HTTP/1.1

This might return something like

[
  { "title": "Workers Leaving The Lumière Factory In Lyon" },
  { "title": "The Dickson Experimental Sound Film" },
  { "title": "The Haunted Castle" }
]

However because a foreign key constraint exists between Films and Directors, we can request this information be included:

GET /films?select=title,directors(last_name) HTTP/1.1

Which would return

[
  { "title": "Workers Leaving The Lumière Factory In Lyon",
    "directors": {
      "last_name": "Lumière"
    }
  },
  { "title": "The Dickson Experimental Sound Film",
    "directors": {
      "last_name": "Dickson"
    }
  },
  { "title": "The Haunted Castle",
    "directors": {
      "last_name": "Méliès"
    }
  }
]

Note

As of PostgREST v4.1, parens () are used rather than brackets {} for the list of embedded columns. Brackets are still supported, but are deprecated and will be removed in v5.

PostgREST can also detect relations going through join tables. Thus you can request the Actors for Films (which in this case finds the information through Roles). You can also reverse the direction of inclusion, asking for all Directors with each including the list of their Films:

GET /directors?select=films(title,year) HTTP/1.1

Note

Whenever foreign key relations change in the database schema you must refresh PostgREST's schema cache to allow resource embedding to work properly. See the section Schema Reloading.

Embedded Filters and Ordering

Embedded tables can be filtered and ordered similarly to their top-level counterparts. To do so, prefix the query parameters with the name of the embedded table. For instance, to order the actors in each film:

GET /films?select=*,actors(*)&actors.order=last_name,first_name HTTP/1.1

This sorts the list of actors in each film but does not change the order of the films themselves. To filter the roles returned with each film:

GET /films?select=*,roles(*)&roles.character=in.Chico,Harpo,Groucho HTTP/1.1

Once again, this restricts the roles included to certain characters but does not filter the films in any way. Films without any of those characters would be included along with empty character lists.
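Assembling such URLs by hand gets error-prone once several embedded-table parameters are involved. A hedged sketch of a tiny builder (the function and its double-underscore convention are our own invention, not a PostgREST client API):

```python
def embed_query(table, select, **params):
    """Build a PostgREST query string with embedded-table parameters.

    Python identifiers can't contain dots, so '__' in keyword names
    stands in for '.', e.g. actors__order -> actors.order.
    """
    pairs = [f"select={select}"]
    pairs += [f"{key.replace('__', '.')}={val}" for key, val in params.items()]
    return f"/{table}?" + "&".join(pairs)

print(embed_query("films", "*,actors(*)", actors__order="last_name,first_name"))
print(embed_query("films", "*,roles(*)", roles__character="in.Chico,Harpo,Groucho"))
```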

Custom Queries

The PostgREST URL grammar limits the kinds of queries clients can perform. It prevents arbitrary, potentially poorly constructed and slow client queries. It's good for quality of service, but means database administrators must create custom views and stored procedures to provide richer endpoints. The most common causes for custom endpoints are

  • Table unions and OR-conditions in the where clause
  • More complicated joins than those provided by Resource Embedding
  • Geospatial queries that require an argument, like "points near (lat,lon)"
  • More sophisticated full-text search than a simple use of the @@ filter

Stored Procedures

Every stored procedure in the API-exposed database schema is accessible under the /rpc prefix. The API endpoint supports only POST which executes the function.

POST /rpc/function_name HTTP/1.1

Such functions can perform any operations allowed by PostgreSQL (read data, modify data, and even DDL operations). However, procedures in PostgreSQL marked with stable or immutable volatility can only read, not modify, the database, and PostgREST executes them in a read-only transaction that is compatible with read replicas.

Procedures must be used with named arguments. To supply arguments in an API call, include a JSON object in the request payload and each key/value of the object will become an argument.

For instance, assume we have created this function in the database.

CREATE FUNCTION add_them(a integer, b integer)
RETURNS integer AS $$
 SELECT $1 + $2;
$$ LANGUAGE SQL IMMUTABLE STRICT;

The client can call it by posting an object like

POST /rpc/add_them HTTP/1.1

{ "a": 1, "b": 2 }

The keys of the object match the parameter names. Note that PostgreSQL converts parameter names to lowercase unless you quote them like CREATE FUNCTION foo("mixedCase" text) .... You can also call a function that takes a single parameter of type json by sending the header Prefer: params=single-object with your request. That way the JSON request body will be used as the single argument.

Note

We recommend using function arguments of type json to accept arrays from the client. To pass a PostgreSQL native array you'll need to quote it as a string:

POST /rpc/native_array_func HTTP/1.1

{ "arg": "{1,2,3}" }
POST /rpc/json_array_func HTTP/1.1

{ "arg": [1,2,3] }

PostgreSQL has four procedural languages that are part of the core distribution: PL/pgSQL, PL/Tcl, PL/Perl, and PL/Python. There are many other procedural languages distributed as additional extensions. Also, plain SQL can be used to write functions (as shown in the example above).

By default, a function is executed with the privileges of the user who calls it. This means that the user has to have all permissions to do the operations the procedure performs. Another option is to define the function with the SECURITY DEFINER option. Then only one permission check will take place, the permission to call the function, and the operations in the function will have the authority of the user who owns the function itself. See the PostgreSQL documentation for more details.

Note

Why the /rpc prefix? One reason is to avoid name collisions between views and procedures. It also helps emphasize to API consumers that these functions are not normal restful things. The functions can have arbitrary and surprising behavior, not the standard "post creates a resource" thing that users expect from the other routes.

We are considering allowing GET requests for functions that are marked non-volatile. Allowing GET is important for HTTP caching. However we still must decide how to pass function parameters since request bodies are not allowed. Also some query string arguments are already reserved for shaping/filtering the output.

Accessing Request Headers/Cookies

Stored procedures can access request headers and cookies by reading GUC variables set by PostgREST per request. They are named request.header.XYZ and request.cookie.XYZ. For example, to read the value of the Origin request header:

SELECT current_setting('request.header.origin', true);

Complex Boolean Logic

For complex boolean logic you can use stored procedures. For example:

CREATE FUNCTION key_customers(country TEXT, company TEXT, salary FLOAT) RETURNS SETOF customers AS $$
  SELECT * FROM customers WHERE (country = $1 AND company = $2) OR salary = $3;
$$ LANGUAGE SQL;

Then you can query by doing:

POST /rpc/key_customers HTTP/1.1

{ "country": "Germany", "company": "Volkswagen", "salary": 120000.00 }

Raising Errors

Stored procedures can return non-200 HTTP status codes by raising SQL exceptions. For instance, here's a saucy function that always errors:

CREATE OR REPLACE FUNCTION just_fail() RETURNS void
  LANGUAGE plpgsql
  AS $$
BEGIN
  RAISE EXCEPTION 'I refuse!'
    USING DETAIL = 'Pretty simple',
          HINT = 'There is nothing you can do.';
END
$$;

Calling the function returns HTTP 400 with the body

{
  "message":"I refuse!",
  "details":"Pretty simple",
  "hint":"There is nothing you can do.",
  "code":"P0001"
}

You can customize the HTTP status code by raising particular exceptions according to the PostgREST error to status code mapping. For example, RAISE insufficient_privilege will respond with HTTP 401/403 as appropriate.

Insertions / Updates

All tables and auto-updatable views can be modified through the API, subject to permissions of the requester's database role.

To create a row in a database table post a JSON object whose keys are the names of the columns you would like to create. Missing properties will be set to default values when applicable.

POST /table_name HTTP/1.1

{ "col1": "value1", "col2": "value2" }

The response will include a Location header describing where to find the new object. If the table is write-only then constructing the Location header will cause a permissions error. To successfully insert an item to a write-only table you will need to suppress the Location response header by including the request header Prefer: return=minimal.

On the other end of the spectrum you can get the full created object back in the response to your request by including the header Prefer: return=representation. That way you won't have to make another HTTP call to discover properties that may have been filled in on the server side. You can also apply the standard Vertical Filtering (Columns) to these results.

Note

When inserting a row you must post a JSON object, not quoted JSON.

Yes
{ "a": 1, "b": 2 }

No
"{ \"a\": 1, \"b\": 2 }"

Some JavaScript libraries will post the data incorrectly if you're not careful. For best results try one of the Client Libraries built for PostgREST.

To update a row or rows in a table, use the PATCH verb. Use Horizontal Filtering (Rows) to specify which record(s) to update. Here is an example query setting the category column to child for all people below a certain age.

PATCH /people?age=lt.13 HTTP/1.1

{ "category": "child" }

Updates also support Prefer: return=representation plus Vertical Filtering (Columns).

Note

Beware of accidentally updating every row in a table. To learn to prevent that see Block Full-Table Operations.

Bulk Insert

Bulk insert works exactly like single row insert except that you provide either a JSON array of objects having uniform keys, or lines in CSV format. This not only minimizes the HTTP requests required but uses a single INSERT statement on the backend for efficiency. Note that using CSV requires less parsing on the server and is much faster.

To bulk insert CSV simply post to a table route with Content-Type: text/csv and include the names of the columns as the first row. For instance

POST /people HTTP/1.1
Content-Type: text/csv

name,age,height
J Doe,62,70
Jonas,10,55

An empty field (,,) is coerced to an empty string and the reserved word NULL is mapped to the SQL null value. Note that there should be no spaces between the column names and commas.
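Following the rules above, a client can render row data as a CSV body. This sketch (helper names are illustrative) maps Python None to the reserved word NULL and missing keys to empty strings:

```python
import csv
import io

def _cell(row, col):
    if col not in row:
        return ""        # missing key -> empty field, coerced to '' by the server
    if row[col] is None:
        return "NULL"    # explicit null -> SQL null
    return row[col]

def csv_payload(rows, columns):
    """Render a list of row dicts as a CSV request body for bulk insert."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(columns)  # column names must come first
    for row in rows:
        writer.writerow([_cell(row, col) for col in columns])
    return buf.getvalue()

rows = [
    {"name": "J Doe", "age": 62, "height": 70},
    {"name": "Jonas", "age": 10, "height": None},
]
print(csv_payload(rows, ["name", "age", "height"]))
```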

To bulk insert JSON post an array of objects having all-matching keys

POST /people HTTP/1.1
Content-Type: application/json

[
  { "name": "J Doe", "age": 62, "height": 70 },
  { "name": "Janus", "age": 10, "height": 55 }
]

Deletions

To delete rows in a table, use the DELETE verb plus Horizontal Filtering (Rows). For instance deleting inactive users:

DELETE /user?active=is.false HTTP/1.1

Note

Beware of accidentally deleting all rows in a table. To learn to prevent that see Block Full-Table Operations.

OpenAPI Support

Every API hosted by PostgREST automatically serves a full OpenAPI description on the root path. This provides a list of all endpoints, along with supported HTTP verbs and example payloads.

You can use a tool like Swagger UI to create beautiful documentation from the description and to host an interactive web-based dashboard. The dashboard allows developers to make requests against a live PostgREST server and provides guidance with request headers and example request bodies.

Note

The OpenAPI information can go out of date as the schema changes under a running server. To learn how to refresh the cache see Schema Reloading.

HTTP Status Codes

PostgREST translates PostgreSQL error codes into HTTP status as follows:

PostgreSQL error code(s) | HTTP status             | Error description
08*                      | 503                     | pg connection err
09*                      | 500                     | triggered action exception
0L*                      | 403                     | invalid grantor
0P*                      | 403                     | invalid role specification
23503                    | 409                     | foreign key violation
23505                    | 409                     | uniqueness violation
25*                      | 500                     | invalid transaction state
28*                      | 403                     | invalid auth specification
2D*                      | 500                     | invalid transaction termination
38*                      | 500                     | external routine exception
39*                      | 500                     | external routine invocation
3B*                      | 500                     | savepoint exception
40*                      | 500                     | transaction rollback
53*                      | 503                     | insufficient resources
54*                      | 413                     | too complex
55*                      | 500                     | obj not in prereq state
57*                      | 500                     | operator intervention
58*                      | 500                     | system error
F0*                      | 500                     | conf file error
HV*                      | 500                     | foreign data wrapper error
P0001                    | 400                     | default code for "raise"
P0*                      | 500                     | PL/pgSQL error
XX*                      | 500                     | internal error
42883                    | 404                     | undefined function
42P01                    | 404                     | undefined table
42501                    | if authed 403, else 401 | insufficient privileges
other                    | 500                     |
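The wildcard entries in the table match on a code prefix. The same mapping can be condensed into a lookup, as in this sketch (our own illustration of the table, not PostgREST's source):

```python
# Prefix -> HTTP status, mirroring the table above (most specific first).
PG_TO_HTTP = [
    ("23503", 409), ("23505", 409),
    ("42883", 404), ("42P01", 404),
    ("P0001", 400),
    ("08", 503), ("09", 500), ("0L", 403), ("0P", 403),
    ("25", 500), ("28", 403), ("2D", 500),
    ("38", 500), ("39", 500), ("3B", 500), ("40", 500),
    ("53", 503), ("54", 413), ("55", 500), ("57", 500), ("58", 500),
    ("F0", 500), ("HV", 500), ("P0", 500), ("XX", 500),
]

def http_status(pg_code, authenticated=False):
    """Translate a PostgreSQL error code into an HTTP status."""
    if pg_code == "42501":  # insufficient privilege: depends on auth state
        return 403 if authenticated else 401
    for prefix, status in PG_TO_HTTP:
        if pg_code.startswith(prefix):
            return status
    return 500  # everything else

print(http_status("23505"), http_status("08006"), http_status("42501"))
```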

Overview of the Role System

PostgREST is designed to keep the database at the center of API security. All authorization happens through database roles and permissions. It is PostgREST's job to authenticate requests - i.e. verify that a client is who they say they are - and then let the database authorize the client's actions.

Authentication Sequence

There are three types of roles used by PostgREST: the authenticator, anonymous, and user roles. The database administrator creates these roles and configures PostgREST to use them.

_images/security-roles.png

The authenticator role should be created with NOINHERIT and configured in the database to have very limited access. It is a chameleon whose job is to "become" other users to service authenticated HTTP requests. The picture below shows how the server handles authentication. If auth succeeds, it switches into the user role specified by the request, otherwise it switches into the anonymous role.

_images/security-anon-choice.png

Here are the technical details. We use JSON Web Tokens to authenticate API requests. As you'll recall, a JWT contains a list of cryptographically signed claims. All claims are allowed, but PostgREST cares specifically about a claim called role.

{
  "role": "user123"
}

When a request contains a valid JWT with a role claim, PostgREST will switch to the database role with that name for the duration of the HTTP request.

SET LOCAL ROLE user123;

Note that the database administrator must have previously allowed the authenticator role to switch into this user:

GRANT user123 TO authenticator;

If a client includes no JWT (or one without a role claim), PostgREST switches into the anonymous role. The actual database-specific name of the anonymous role (like that of the authenticator role) is specified in the PostgREST server configuration file. The database administrator must set anonymous role permissions correctly to prevent anonymous users from seeing or changing things they shouldn't.

Users and Groups

PostgreSQL manages database access permissions using the concept of roles. A role can be thought of as either a database user, or a group of database users, depending on how the role is set up.

Roles for Each Web User

PostgREST can accommodate either viewpoint. If you treat a role as a single user, then the JWT-based role switching described above does most of what you need. When an authenticated user makes a request, PostgREST switches into that user's role, which, in addition to restricting queries, is available to SQL through the current_user variable.

You can use row-level security to flexibly restrict visibility and access for the current user. Here is an example from Tomas Vondra (http://blog.2ndquadrant.com/application-users-vs-row-level-security/): a chat table storing messages sent between users. Users can insert rows into it to send messages to other users, and query it to see messages sent to them by other users.

CREATE TABLE chat (
  message_uuid    UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  message_time    TIMESTAMP NOT NULL DEFAULT now(),
  message_from    NAME      NOT NULL DEFAULT current_user,
  message_to      NAME      NOT NULL,
  message_subject VARCHAR(64) NOT NULL,
  message_body    TEXT
);

We want to enforce a policy ensuring that a user can see only those messages he sent or that are intended for him. We also want to prevent a user from forging the message_from column with another person's name.

PostgreSQL (9.5 and later) allows us to set this policy with row-level security:

CREATE POLICY chat_policy ON chat
  USING ((message_to = current_user) OR (message_from = current_user))
  WITH CHECK (message_from = current_user);

Anyone accessing the generated API endpoint for the chat table will see exactly the rows they should, without us having to write custom imperative server-side code.

Web Users Sharing a Role

Alternately, database roles can represent groups instead of (or in addition to) individual users. You may choose to have all signed-in users of a web application share a single webuser role. You can distinguish individual users by including extra claims in the JWT, such as an email address.

{
  "role": "webuser",
  "email": "john@doe.com"
}

SQL code can access claims through GUC variables set by PostgREST per request. For instance, to get the email claim, call this function:

current_setting('request.jwt.claim.email', true)

This allows JWT generation services to include extra information and your database code to react to it. For instance the RLS example could be modified to use this current_setting rather than current_user. The second 'true' argument tells current_setting to return NULL if the setting is missing from the current configuration.

Hybrid User-Group Roles

There is no performance penalty for having many database roles, although roles are namespaced per cluster rather than per database, so they may be prone to collisions within the database. You are free to assign a new role for every user of your web application if desired, and you can mix the group and individual role strategies. For instance, we could still have a webuser role plus individual users who inherit from it:

CREATE ROLE webuser NOLOGIN;
-- grant this role access to certain tables etc

CREATE ROLE user000 NOLOGIN;
GRANT webuser TO user000;
-- now user000 can do whatever webuser can

GRANT user000 TO authenticator;
-- allow authenticator to switch into user000 role
-- (the role itself has nologin)

Custom Validation

PostgREST honors the exp claim for token expiration, rejecting expired tokens. However, it does not enforce any extra constraints. An example of an extra constraint would be to immediately revoke access for a certain user. The configuration file parameter pre-request specifies a stored procedure to call immediately after the authenticator switches into a new role and before the main query itself runs.

Here's an example. In the config file specify a stored procedure:

pre-request = "public.check_user"

In the function you can run arbitrary code to check the request and raise an exception to block it if desired.

CREATE OR REPLACE FUNCTION check_user() RETURNS void
  LANGUAGE plpgsql
  AS $$
BEGIN
  IF current_user = 'evil_user' THEN
    RAISE EXCEPTION 'No, you are evil'
      USING HINT = 'Stop being so evil and maybe you can log in';
  END IF;
END
$$;

Client Auth

To make an authenticated request, the client must include an Authorization HTTP header with the value Bearer <jwt>. For instance:

GET /foo HTTP/1.1
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJyb2xlIjoiamRvZSIsImV4cCI6MTQ3NTUxNjI1MH0.GYDZV3yM0gqvuEtJmfpplLBXSGYnke_Pvnl0tbKAjB4

JWT Generation

You can create a valid JWT either from inside your database or via an external service. Every token is cryptographically signed with a secret passphrase - the signer and verifier share the secret. Hence any service that shares a passphrase with a PostgREST server can create valid JWTs. (PostgREST currently supports only the HMAC-SHA256 signing algorithm.)
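For illustration, HMAC-SHA256 signing needs nothing beyond a language's standard library. This sketch (not production code; it handles no exp claim or validation) produces a token that a PostgREST server sharing the same secret should accept:

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """Base64url-encode without the '=' padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_jwt(claims: dict, secret: str) -> str:
    """Sign claims as an HS256 JWT: header.payload.signature."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"},
                               separators=(",", ":")).encode())
    payload = b64url(json.dumps(claims, separators=(",", ":")).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url(sig)}"

print(make_jwt({"role": "user123"}, "mysecret"))
```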

JWT from SQL

You can create JWT tokens in SQL using the pgjwt extension (https://github.com/michelp/pgjwt). It's simple and requires only pgcrypto. If you're on an environment like Amazon RDS which doesn't support installing new extensions, you can still manually run the SQL inside pgjwt, which creates the functions you need.

Next, write a stored procedure that returns the token. The one below returns a token with a hard-coded role, which expires five minutes after it was issued. Note this function has a hard-coded secret as well.

CREATE TYPE jwt_token AS (
  token text
);

CREATE FUNCTION jwt_test() RETURNS public.jwt_token
    LANGUAGE sql
    AS $$
  SELECT sign(
    row_to_json(r), 'mysecret'
  ) AS token
  FROM (
    SELECT
      'my_role'::text as role,
      extract(epoch from now())::integer + 300 AS exp
  ) r;
$$;

PostgREST exposes this function to clients via a POST request to /rpc/jwt_test (as with all functions, it is reachable under the /rpc prefix).

Note

To avoid hard-coding the secret in stored procedures, save it as a property of the database.

-- run this once
ALTER DATABASE mydb SET "app.jwt_secret" TO '!!secret!!';

-- then all functions can refer to app.jwt_secret
SELECT sign(
  row_to_json(r), current_setting('app.jwt_secret')
) AS token
FROM ...

JWT from Auth0

An external service like Auth0 (https://auth0.com/) can do the heavy lifting of transforming OAuth from Github, Twitter, Google, etc. into a JWT suitable for PostgREST. Auth0 can also handle email signup and password reset flows.

To use Auth0, copy its client secret into your PostgREST configuration file as the jwt-secret. (Old-style Auth0 secrets are Base64 encoded. For these secrets set secret-is-base64 to true, or just refresh the Auth0 secret.) You can find the secret in the client settings of the Auth0 management console.

Our code requires a database role in the JWT. To add it you need to save the database role in Auth0 app metadata (https://auth0.com/docs/rules/metadata-in-rules). Then write a rule that extracts the role from the user metadata and includes a role claim in the payload of our user object. Then, in your Auth0Lock code, include the role claim in your scope param (https://auth0.com/docs/libraries/lock/v10/sending-authentication-parameters#scope-string-).

// Example Auth0 rule
function (user, context, callback) {
  user.app_metadata = user.app_metadata || {};
  user.role = user.app_metadata.role;
  callback(null, user, context);
}
// Example using Auth0Lock with role claim in scope
new Auth0Lock ( AUTH0_CLIENTID, AUTH0_DOMAIN, {
  container: 'lock-container',
  auth: {
    params: { scope: 'openid role' },
    redirectUrl: FQDN + '/login', // Replace with your redirect url
    responseType: 'token'
  }
})

JWT Security

There are at least three common criticisms of using JWT: 1) against the standard itself, 2) against using libraries with known security vulnerabilities, and 3) against using JWT for web sessions. We'll briefly explain each criticism, how PostgREST deals with it, and make recommendations for appropriate user action.

The criticism against the JWT standard (https://tools.ietf.org/html/rfc7519) is explained in detail elsewhere online. The part most relevant to PostgREST is the so-called alg=none issue. Some servers implementing JWT allow clients to choose the algorithm used to sign the JWT. In this case, an attacker could set the algorithm to none, require no signature at all, and gain unauthorized access. The current implementation of PostgREST, however, does not allow clients to set the signature algorithm in the HTTP request, making this attack irrelevant. The criticism against the standard is that it requires the implementation of alg=none at all.

The criticism of JWT libraries applies to PostgREST only through the library it uses. As mentioned above, not allowing clients to choose the signature algorithm in the HTTP request removes the biggest risk. Another, more subtle attack is possible where servers use asymmetric algorithms such as RSA for signatures. Once again, this is not relevant to PostgREST since it is not supported. Curious readers can find more information in this article: https://auth0.com/blog/critical-vulnerabilities-in-json-web-token-libraries/. For recommendations of high-quality libraries to use in your API clients, see jwt.io (https://jwt.io/).

The last criticism focuses on the misuse of JWT for maintaining web sessions. The basic recommendation is to stop using JWT for sessions (http://cryto.net/~joepie91/blog/2016/06/13/stop-using-jwt-for-sessions/), because most, if not all, of the proposed solutions to the problems that arise do not work. The linked article discusses the problems in depth, but the essence is that JWT was not designed to be a secure and stateful unit of client-side storage, and hence is not suited for session management.

PostgREST uses JWT mainly for authentication and authorization purposes, and encourages users to do the same. For web sessions, using cookies over HTTPS is fine and well catered for by standard web frameworks.

SSL

PostgREST aims to do one thing well: add an HTTP interface to a PostgreSQL database. To keep the code small and focused we do not implement SSL. Use a reverse proxy such as NGINX to add this (https://nginx.org/en/docs/http/configuring_https_servers.html). Note that some platforms-as-a-service like Heroku also add SSL automatically in their load balancer.

Schema Isolation

A PostgREST instance is configured to expose all the tables, views, and stored procedures of a single schema specified in the server configuration file. This means private data or implementation details can go inside a private schema and be invisible to HTTP clients. You can then expose views and stored procedures, insulating the internal details from the outside world. It keeps the code easier to refactor, and provides a natural way to do API versioning. For an example of wrapping a private table with a public view, see the Public User Interface section below.

SQL User Management

Storing Users and Passwords

As mentioned, an external service can provide user management and coordinate with the PostgREST server using JWT. It's also possible to support logins entirely through SQL. It's a fair bit of work, so get ready.

The following table, functions, and triggers will live in a basic_auth schema that you should not expose publicly in the API. The public views and functions will live in a different schema which internally references this hidden internal state.

First we'll need a table to keep track of our users:

-- We put things inside the basic_auth schema to hide
-- them from public view. Certain public procs/views will
-- refer to helpers and tables inside.
create schema if not exists basic_auth;

create table if not exists
basic_auth.users (
  email    text primary key check ( email ~* '^.+@.+\..+$' ),
  pass     text not null check (length(pass) < 512),
  role     name not null check (length(role) < 512)
);

We would like the role to be a foreign key to actual database roles, however PostgreSQL does not support such constraints against the pg_roles table. We'll use a trigger to manually enforce it.

create or replace function
basic_auth.check_role_exists() returns trigger
  language plpgsql
  as $$
begin
  if not exists (select 1 from pg_roles as r where r.rolname = new.role) then
    raise foreign_key_violation using message =
      'unknown database role: ' || new.role;
    return null;
  end if;
  return new;
end
$$;

drop trigger if exists ensure_user_role_exists on basic_auth.users;
create constraint trigger ensure_user_role_exists
  after insert or update on basic_auth.users
  for each row
  execute procedure basic_auth.check_role_exists();

Next we'll use the pgcrypto extension and a trigger to keep passwords safe in the users table.

create extension if not exists pgcrypto;

create or replace function
basic_auth.encrypt_pass() returns trigger
  language plpgsql
  as $$
begin
  if tg_op = 'INSERT' or new.pass <> old.pass then
    new.pass = crypt(new.pass, gen_salt('bf'));
  end if;
  return new;
end
$$;

drop trigger if exists encrypt_pass on basic_auth.users;
create trigger encrypt_pass
  before insert or update on basic_auth.users
  for each row
  execute procedure basic_auth.encrypt_pass();

With the table in place we can make a helper to check a password against the encrypted column. It returns the database role for a user if the email and password are correct.

create or replace function
basic_auth.user_role(email text, pass text) returns name
  language plpgsql
  as $$
begin
  return (
  select role from basic_auth.users
   where users.email = user_role.email
     and users.pass = crypt(user_role.pass, users.pass)
  );
end;
$$;

Public User Interface

In the previous section we created an internal table to store user information. Here we create a login function which takes an email address and password and returns a JWT if the credentials match a user in the internal table.

Logins

As described in JWT from SQL, we'll create a JWT inside our login function. Note that you'll need to adjust the secret key which is hard-coded in this example to a secure secret of your choosing.

create or replace function
login(email text, pass text) returns basic_auth.jwt_token
  language plpgsql
  as $$
declare
  _role name;
  result basic_auth.jwt_token;
begin
  -- check email and password
  select basic_auth.user_role(email, pass) into _role;
  if _role is null then
    raise invalid_password using message = 'invalid user or password';
  end if;

  select sign(
      row_to_json(r), 'mysecret'
    ) as token
    from (
      select _role as role, login.email as email,
         extract(epoch from now())::integer + 60*60 as exp
    ) r
    into result;
  return result;
end;
$$;

An API request to call this function would look like:

POST /rpc/login HTTP/1.1

{ "email": "foo@bar.com", "pass": "foobar" }

The response would look like the snippet below. Try decoding the token at jwt.io (https://jwt.io/). (It was encoded with the secret mysecret as specified in the SQL code above. You'll want to change this secret in your app!)

{
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJlbWFpbCI6ImZvb0BiYXIuY29tIiwicm9sZSI6ImF1dGhvciJ9.fpf3_ERi5qbWOE5NPzvauJgvulm0zkIG9xSm2w5zmdw"
}

Permissions

Your database roles need access to the schema, tables, views, and functions in order to service HTTP requests. Recall from the Overview of the Role System that PostgREST uses special roles to process requests, namely the authenticator and anonymous roles. Here is an example of permissions that allow anonymous users to create accounts and attempt to log in.

-- the names "anon" and "authenticator" are configurable and not
-- sacred, we simply choose them for clarity
create role anon;
create role authenticator noinherit;
grant anon to authenticator;

grant usage on schema public, basic_auth to anon;
grant select on table pg_authid, basic_auth.users to anon;
grant execute on function login(text,text) to anon;

You may be worried that anonymous users can read everything from the basic_auth.users table. However, this table is not meant to be queried directly, because it is in a separate schema. The anonymous role needs access because the public users view reads the underlying table using the permissions of the calling user. But we have made sure the view properly restricts access to sensitive information.