diff --git a/manual/develop/README.md b/manual/develop/README.md
index 47ef4f9e..c80869c0 100644
--- a/manual/develop/README.md
+++ b/manual/develop/README.md
@@ -13,4 +13,3 @@ Seafile Open API
 Seafile Implement Details
 
 * [Seafile Data Model](data_model.md)
-* [Seafile Server Components](server-components.md)
diff --git a/manual/develop/data_model.md b/manual/develop/data_model.md
index 4fcf7645..c2848201 100644
--- a/manual/develop/data_model.md
+++ b/manual/develop/data_model.md
@@ -1,37 +1,66 @@
 # Data Model
 
-Seafile internally uses a data model similar to GIT's. It consists of `Repo`, `Branch`, `Commit`, `FS`, and `Block`.
+Seafile internally uses a data model similar to GIT's. It consists of `Repo`, `Commit`, `FS`, and `Block`.
+
+Seafile's high performance comes from its architectural design: file metadata is stored in object storage (or the file system), while only a small amount of metadata about the libraries is stored in a relational database. The figure below gives an overview of the architecture. The following sections describe the data model in more detail.
+
+![Seafile architecture](./seafile_architecture.png)
 
 ## Repo
 
 A repo is also called a library. Every repo has an unique id (UUID), and attributes like description, creator, password.
 
-## Branch
+The metadata for a repo is stored in the `seafile_db` database and in the commit objects (described in a later section).
 
-Unlike git, only two predefined branches is used, i.e., `local` and `master`.
+There are a few tables in the `seafile_db` database containing important information about each repo.
 
-In PC client, modifications will first be committed to the `local` branch.
-Then the `master` branch is downloaded from server, and merged into `local` branch.
-After that the `local` branch will be uploaded to server. Then the server will fast-forward
-its `master` branch to the head commit of the just uploaded branch.
-
-When users update a repo on the web, modifications will first be committed to temporary branch
-on the server, then merged into the `master` branch.
+* `Repo`: contains the ID for each repo.
+* `RepoOwner`: contains the owner ID for each repo.
+* `RepoInfo`: a "cache" table for fast access to repo metadata stored in the commit object. It includes the repo name, update time, and last modifier.
+* `RepoSize`: the total size of all files in the repo.
+* `RepoFileCount`: the number of files in the repo.
+* `RepoHead`: contains the "head commit ID". This ID points to the head commit in the storage, which is described in the next section.
 
 ## Commit
 
-Like in GIT.
+Commit objects save the change history of a repo. Each update from the web interface or sync upload operation creates a new commit object. A commit object contains the following information: commit ID, library name, creator of this commit (a.k.a. the modifier), creation time of this commit (a.k.a. modification time), root FS object ID, and parent commit ID.
+
+The root FS object ID points to the root FS object, from which we can traverse a file system snapshot of the repo.
+
+The parent commit ID points to the commit immediately preceding the current one. The `RepoHead` table contains the latest head commit ID for each repo. From this head commit, we can traverse the repo history.
+
+If you use the file system as the storage backend, commit objects are stored in the path `seafile-data/storage/commits/`. If you use object storage, commit objects are stored in the `commits` bucket.
 
 ## FS
 
-There are two types of FS objects, `SeafDir Object` and `Seafile Object`.
-`SeafDir Object` represents a directory, and `Seafile Object` represents a file.
+There are two types of FS objects, `SeafDir Object` and `Seafile Object`. `SeafDir Object` represents a directory, and `Seafile Object` represents a file.
+
+The `SeafDir` object contains metadata for each file/sub-folder, which includes name, last modification time, last modifier, size, and object ID. The object ID points to another `SeafDir` or `Seafile` object. The `Seafile` object contains a block list, which is a list of block IDs for the file.
+
+FS object IDs are calculated from the contents of the object. That means if a folder or a file is not changed, the same objects are reused across multiple commits. This allows us to create snapshots very efficiently.
+
+If you use the file system as the storage backend, FS objects are stored in the path `seafile-data/storage/fs/`. If you use object storage, FS objects are stored in the `fs` bucket.
 
 ## Block
 
 A file is further divided into blocks with variable lengths.
 We use Content Defined Chunking algorithm to divide file into blocks.
 A clear overview of this algorithm can be found at http://pdos.csail.mit.edu/papers/lbfs:sosp01/lbfs.pdf.
-On average, a block's size is around 1MB.
+On average, a block's size is around 8MB.
 
 This mechanism makes it possible to deduplicate data between different versions of frequently updated files, improving storage efficiency. It also enables transferring data to/from multiple servers in parallel.
+
+If you use the file system as the storage backend, blocks are stored in the path `seafile-data/storage/blocks/`. If you use object storage, blocks are stored in the `blocks` bucket.
+
+## Virtual Repo
+
+A "virtual repo" is a special repo that is created in the following cases:
+
+* A folder in a library is shared.
+* A folder in a library is synced selectively from the sync client.
+
+A virtual repo can be understood as a view of part of the data in its parent library. For example, when sharing a folder, the virtual repo only provides access to the shared folder in that library. A virtual repo uses the same underlying data as its parent library, so it uses the same `fs` and `blocks` storage locations as its parent.
+
+A virtual repo has its own change history, so its `commits` storage location is separate from its parent's. Changes in a virtual repo and its parent repo are merged bidirectionally, so that changes made on either side are visible on the other.
+
+There is a `VirtualRepo` table in the `seafile_db` database. It contains the folder path in the parent repo for each virtual repo.
diff --git a/manual/develop/seafile_architecture.png b/manual/develop/seafile_architecture.png
new file mode 100644
index 00000000..6a024f3f
Binary files /dev/null and b/manual/develop/seafile_architecture.png differ
diff --git a/manual/develop/server-components.md b/manual/develop/server-components.md
deleted file mode 100644
index bd0ab06d..00000000
--- a/manual/develop/server-components.md
+++ /dev/null
@@ -1,25 +0,0 @@
-# Components of Seafile Server
-
-Seafile server comprises of the following services.
-
-* **Ccnet daemon** (ccnet for client side or ccnet-server for server side):networking service daemon. In our initial design, Ccnet worked like a traffic bus. All the network traffic between client, server and internal traffic between different components would go through Ccnet. After further development we found that file transfer is improved by utilizing the Seafile daemon component directly.
-* **Seafile daemon**:data service daemon
-* **Seahub**:the website. Seafile server package contains a light-weight Python HTTP server `gunicorn` that serves the website. Seahub runs as an application within gunicorn.
-* **FileServer**: handles raw file upload/download functions for Seahub. Due to Gunicorn being poor at handling large files, so we wrote this "FileServer" in the C programming language to serve raw file upload/download.
-* **Controller**: monitors ccnet and Seafile daemons, restarts them if necessary.
-
-**The picture below shows how Seafile desktop client syncs files with Seafile server**:
-
-![seafile-sync-arch](../images/seafile-sync-arch.png)
-
-
-
-**The picture below shows how Seafile mobile client interacts with Seafile server**:
-
-![mobile-arch](../images/mobile-arch.png)
-
-
-
-**The picture below shows how Seafile mobile client interacts with Seafile server if the server is configured behind Nginx/Apache**:
-
-![mobile-nginx-arch](../images/mobile-nginx-arch.png)
diff --git a/mkdocs.yml b/mkdocs.yml
index f518bf0a..04485a90 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -170,7 +170,6 @@ nav:
     - Web API V2.1: develop/web_api_v2.1.md
     - PHP API: https://github.com/rene-s/Seafile-PHP-SDK
     - Data Model: develop/data_model.md
-    - Server Components: develop/server-components.md
   - ChangeLog:
     - Seafile Community Edition: changelog/server-changelog.md
     - Seafile Professional Edition: changelog/changelog-for-seafile-professional-server.md
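
Review note on the new `data_model.md` text: the claims that unchanged folders are reused across commits and that history can be walked from the head commit may be easier to follow with a small illustration. The sketch below is only a toy model, not Seafile's implementation. The in-memory dictionaries, the JSON serialization, the SHA-1 hashing, and every function and field name (`put_object`, `make_dir`, `root_id`, `parent_id`, and so on) are assumptions made for illustration.

```python
import hashlib
import json

# Hypothetical in-memory stand-ins for the `fs` and `commits` storage locations.
fs_store = {}
commit_store = {}


def put_object(store, obj):
    """Store obj under an ID derived from its content and return that ID."""
    data = json.dumps(obj, sort_keys=True).encode("utf-8")
    obj_id = hashlib.sha1(data).hexdigest()  # hashing scheme is an assumption
    store[obj_id] = obj
    return obj_id


def make_file(block_ids):
    """A file object holds the list of block IDs that make up the file."""
    return put_object(fs_store, {"type": "file", "block_ids": block_ids})


def make_dir(entries):
    """A directory object maps each name to a file or sub-directory object ID."""
    return put_object(fs_store, {"type": "dir", "entries": entries})


def make_commit(root_id, parent_id, description):
    """A commit points to a root directory object and to its parent commit."""
    commit = {"root_id": root_id, "parent_id": parent_id, "description": description}
    return put_object(commit_store, commit)


def history(head_id):
    """Walk parent links from the head commit back to the first commit."""
    ids = []
    while head_id is not None:
        ids.append(head_id)
        head_id = commit_store[head_id]["parent_id"]
    return ids


# First snapshot: docs/a.txt and b.txt.
a = make_file(["block-1", "block-2"])
b = make_file(["block-3"])
docs = make_dir({"a.txt": a})
root1 = make_dir({"docs": docs, "b.txt": b})
c1 = make_commit(root1, None, "initial")

# Second snapshot: only b.txt changes, so the docs folder hashes to the same ID.
b2 = make_file(["block-4"])
docs_again = make_dir({"a.txt": a})
root2 = make_dir({"docs": docs_again, "b.txt": b2})
c2 = make_commit(root2, c1, "modify b.txt")
```

In this toy model the second commit adds only two new FS objects (the changed file and the new root directory), while the `docs` folder and `a.txt` are shared with the first commit. Calling `history(c2)` returns the commit IDs from newest to oldest, which mirrors starting from the ID recorded in `RepoHead`.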