From 26aeda8364895666fad64bb4f05f6851bbb6351c Mon Sep 17 00:00:00 2001 From: Ian Thomas Date: Thu, 4 Jun 2026 14:57:00 +0100 Subject: [PATCH] Add docs on environment variables and wasm-specific behaviour --- docs/env_vars.md | 71 ++++++++++++++++++++++++++++++++++++ docs/index.md | 26 ++++++------- docs/wasm_build.md | 89 +++++++++++++++++++++++++++++++++++++++++++++ src/wasm/stream.cpp | 3 +- 4 files changed, 175 insertions(+), 14 deletions(-) create mode 100644 docs/env_vars.md create mode 100644 docs/wasm_build.md diff --git a/docs/env_vars.md b/docs/env_vars.md new file mode 100644 index 0000000..166cc1c --- /dev/null +++ b/docs/env_vars.md @@ -0,0 +1,71 @@ +# Environment variables + + +## In all builds + +The following four environment variables should be set for the `git2cpp commit` and `git2cpp merge` +subcommands. Using `git2cpp config` instead is partially supported and will be improved over +time. + +`GIT_AUTHOR_EMAIL` +: The email for the "author" field. + +`GIT_AUTHOR_NAME` +: The human-readable name for the "author" field. + +`GIT_COMMITTER_EMAIL` +: The email for the "committer" field. + +`GIT_COMMITTER_NAME` +: The human-readable name for the "committer" field. + + +## In WebAssembly build only + +(git_cors_proxy)= +`GIT_CORS_PROXY` +: In-browser remote `git2cpp` operations such as `clone`, `fetch` and `push` usually require use of + a [CORS proxy server](cors_proxy). Use this environment variable to specify how the target URL is + encoded into the CORS proxy URL; the details depend on how the CORS proxy server is + implemented. + + The `GIT_CORS_PROXY` value should contain the URL of the CORS proxy itself, followed by a number of + substitutions which are denoted by curly braces. To illustrate the substitutions, assume that the + `git2cpp` command is for the repository at `https://github.com/organisation/repository`. 
+ + Substitutions: + + - `{host}` is replaced by `github.com` + - `{path}` is replaced by `/organisation/repository/` followed by extra information that depends + on details of the `git2cpp` operation being performed + - `{protocol}` is replaced by `https:` + - `{url}` is equivalent to `{protocol}//{host}{path}` + - `{api_key}` is replaced by the value of the environment variable `GIT_CORS_PROXY_API_KEY` if it is + set. + + If no substitutions are specified then `{url}` is appended. + + All of the substitutions except `{api_key}` have an `:encode` variant such as `{url:encode}` + that passes the argument through the + [encodeURIComponent](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent) + JavaScript function, which some CORS proxies require. + + For debugging, you can check the actual URL used for requests in the Network tab of your browser's + Developer Tools. + + See [CORS proxy server](cors_proxy) for usage examples. + +`GIT_CORS_PROXY_API_KEY` +: This value replaces the `{api_key}` substitution in `GIT_CORS_PROXY` and is intended for use with a + CORS proxy that requires an API key. Alternatively, the API key can be put directly in + `GIT_CORS_PROXY`. + +(git_http_timeout)= +`GIT_HTTP_TIMEOUT` +: In the WebAssembly build, all http(s) requests are limited by a timeout which has a default of 10 + seconds. To use a different timeout, set the `GIT_HTTP_TIMEOUT` environment variable. For example, + to set a timeout of 20 seconds use: + + ```bash + export GIT_HTTP_TIMEOUT=20 + ``` diff --git a/docs/index.md b/docs/index.md index 81f5be3..c8799bb 100644 --- a/docs/index.md +++ b/docs/index.md @@ -1,28 +1,28 @@ # Overview `git2cpp` is a C++ wrapper of [libgit2](https://libgit2.org/) to provide a command-line interface -(CLI) to `git` functionality. 
The intended use is in WebAssembly in-browser terminals (the -[cockle](https://github.com/jupyterlite/cockle) and -[JupyterLite terminal](https://github.com/jupyterlite/terminal) projects) but it can be compiled and +(CLI) to `git` functionality. The intended use is in WebAssembly in-browser terminals (see the +[cockle](https://github.com/jupyterlite/cockle), +[JupyterLite terminal](https://github.com/jupyterlite/terminal) and +[Notebook.link](https://notebook.link) projects) but it can be compiled and used on any POSIX-compliant system. The Help pages here are generated from the `git2cpp` command and subcommands to show the functionality that is currently supported. If there are features missing that you would like to use, please create an issue in the [git2cpp github repository](https://github.com/QuantStack/git2cpp). +The Appendix contains additional information on the [Environment variables](env_vars.md) used by +`git2cpp` and on the behaviour of the [WebAssembly build](wasm_build.md). + ```{toctree} :caption: Help pages :hidden: created/git2cpp ``` -## Environment variables - -`GIT_HTTP_TIMEOUT` -: In the WebAssembly build, all http(s) requests are limited by a timeout which has a default of 10 - seconds. To use a different timeout set the `GIT_HTTP_TIMEOUT` environment variable. For example, - to set a timeout of 20 seconds use: - - ```bash - export GIT_HTTP_TIMEOUT=20 - ``` +```{toctree} +:caption: Appendix +:hidden: +env_vars +wasm_build +``` diff --git a/docs/wasm_build.md b/docs/wasm_build.md new file mode 100644 index 0000000..f5f452d --- /dev/null +++ b/docs/wasm_build.md @@ -0,0 +1,89 @@ +# WebAssembly build + +The in-browser WebAssembly build of `git2cpp` is intended to behave as similarly as possible to other +builds, but there are some differences when accessing remote servers, in particular with +blocking requests and the requirement for a CORS proxy server. 
+ + +## Blocking requests + +Remote `git2cpp` requests in non-WebAssembly builds are progressive and feedback is provided as the +data is streamed back from the remote. In WebAssembly builds, however, such requests are blocking +and no feedback can be provided until the entire response has been received from the remote. +This can be a long time to wait without feedback, and if there is an error such that a response is +not received it could block forever, leaving the in-browser terminal unusable. + +Hence the WebAssembly build limits http(s) requests with a timeout that defaults to 10 seconds. +This timeout can be changed using the [`GIT_HTTP_TIMEOUT` environment variable](git_http_timeout). + +In addition, when a `git2cpp` response is received that is larger than 10 MB, a prompt is presented +to the user to confirm whether or not to proceed with unpacking the response. Note that the size of the +response may be smaller or larger than the size of the directory structure it unpacks to. + + +(cors_proxy)= +## CORS proxy server + +The fetching of resources in a browser from one domain to another is governed by a browser +security feature called +[Cross-Origin Resource Sharing](https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/CORS) +(CORS). For this to be allowed the target domain must indicate that it accepts cross-origin +requests by adding certain headers to `https` responses. Most git servers such as `github.com` do +not add these headers, so a `git2cpp clone` from `github.com` will fail with a CORS error if run +from within a browser, whereas there is no such limitation if run from a terminal outside a browser. + +The solution to this problem is to use a separate CORS proxy server. The `git2cpp` remote request is +sent to this intermediate server, which forwards the request to the target server. As this request is +coming from outside a browser it is not subject to CORS restrictions. 
The proxy receives the +response from the target server and adds the required CORS headers before returning it to `git2cpp`. + +Various public CORS proxy servers are available for use, or you can serve your own. It can be useful to +serve your own on `localhost` when experimenting to confirm that everything works as expected +before moving on to a more complex solution. Be aware that the CORS proxy server is able to read the +content of your request so be careful if you are using authentication tokens. + +The [`GIT_CORS_PROXY` environment variable](git_cors_proxy) is used to specify how the target URL is +encoded in the CORS proxy URL. + +### Example running a local CORS proxy server + +If you are running a local [cockle](https://github.com/jupyterlite/cockle) or +[JupyterLite terminal](https://github.com/jupyterlite/terminal) deployment you can also run a local +CORS proxy such as [CORS Anywhere](https://github.com/Rob--W/cors-anywhere) to test it out. +In a separate terminal on your host machine, with `nodejs` available, `cd` to a new +clean directory, and download and run the CORS proxy server using: + +```bash +npm install cors-anywhere +HOST=localhost PORT=8881 node node_modules/cors-anywhere/server.js +``` + +This will start the CORS proxy server listening on `http://localhost:8881/`. To use this in your +local `cockle` or `JupyterLite terminal` deployment in your browser, set the `GIT_CORS_PROXY` environment variable: +```bash +export GIT_CORS_PROXY=http://localhost:8881/ +``` +and then try a `git2cpp clone` using something like: +```bash +git2cpp clone https://github.com/some-organisation/some-repository +``` + +### Example using a public CORS proxy server + +There is a public instance of [CORS Anywhere](https://github.com/Rob--W/cors-anywhere) available at +`https://cors-anywhere.herokuapp.com/`. This can be used for demonstration purposes but it requires +an explicit opt-in and your access will be time-limited. 
To request temporary access go to +`https://cors-anywhere.herokuapp.com/` and follow the instructions there. + +Once you have access, you can try this out in one of the public deployments such as those at +[https://jupyterlite.github.io/cockle](https://jupyterlite.github.io/cockle) or +[https://jupyterlite.github.io/terminal](https://jupyterlite.github.io/terminal). + +Set the `GIT_CORS_PROXY` environment variable: +```bash +export GIT_CORS_PROXY=https://cors-anywhere.herokuapp.com/ +``` +and then try a `git2cpp clone` using something like: +```bash +git2cpp clone https://github.com/some-organisation/some-repository +``` diff --git a/src/wasm/stream.cpp b/src/wasm/stream.cpp index b16dcc3..7b12109 100644 --- a/src/wasm/stream.cpp +++ b/src/wasm/stream.cpp @@ -116,7 +116,8 @@ EM_JS( ); EM_JS(const char*, js_maybe_convert_url, (const char* url_str), { - // Convert URL to use CORS proxy based on env vars GIT_CORS_PROXY and GIT_CORS_PROXY_TYPE. + // Convert URL to use CORS proxy based on env vars GIT_CORS_PROXY and possibly + // GIT_CORS_PROXY_API_KEY. // If no conversion occurs, return the original unconverted URL as a new string. const url_js = UTF8ToString(url_str); const url = new URL(url_js);