
API

install

install(*packages, o='', timeout=60)

Install package(s) using uv.

  • This function is a convenience interface of the uv command uv pip install --system ⟨o⟩ -- ⟨packages⟩. See also its convenience alias update().

Parameters:

  • packages (str, default: () ) –

    Specifiers of the packages to install. In addition to the uv-supported package specifiers, colab-assist provides a shorthand for installing packages in remote Git repositories via HTTP:

    [⟨auth⟩@][⟨host⟩/]⟨owner⟩/⟨repo⟩[@⟨ref⟩]
    

    • ⟨auth⟩ is optional authorization info for a private repository, such as a GitHub personal access token (PAT).

      • If ⟨auth⟩ is $, an input prompt for authorization info will spawn.

      • If ⟨auth⟩ is prefixed with $, the part after $ is treated as the name of a Colab Secret containing the authorization info. Currently this is the recommended way of managing private info on Colab.

      • Otherwise, ⟨auth⟩ is assumed to be the authorization info proper.

    • ⟨host⟩ is an optional specification of the remote repository host.

      • If ⟨host⟩ is not provided, the host domain name will be github.com.

      • If ⟨host⟩ is prefixed with $, it is treated as an abbreviation tag:

        • $gh for github.com.
        • $gl for gitlab.com.
        • $bb for bitbucket.org.
      • Otherwise ⟨host⟩ is assumed to be a host domain name.

    • ⟨owner⟩/⟨repo⟩ is the required identifier of the remote repository.

    • ⟨ref⟩ is an optional specification of a branch, a tag, or a commit.

  • o (str, default: '' ) –

    Additional options for the uv pip install command.

  • timeout (int | None, default: 60 ) –

    Timeout in seconds for the spawned subprocess.

    • None: No timeout.

Examples:

import colab_assist as A

# Install the latest versions of DuckDB and Polars.
# This is equivalent to `A.update("duckdb", "polars")`.
A.install("duckdb", "polars", o="-U")

# Use the PAT stored in the Colab Secret named `my-token`
# to install the Python package hosted in the `feat/foo` branch
# of the private GitHub repository `me/my-repo`.
A.install("$my-token@me/my-repo@feat/foo")
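
As a rough illustration of the command described above, the following sketch assembles the argument list for `uv pip install --system ⟨o⟩ -- ⟨packages⟩` without running it. The helper name `build_install_argv` is hypothetical, `split` is assumed to be `shlex.split`, and the shorthand expansion that `install()` applies to each package specifier is left out.

```python
from itertools import chain
from shlex import split


def build_install_argv(*packages: str, o: str = "") -> tuple:
    """Assemble the argv tuple; hypothetical helper, nothing is executed."""
    return tuple(
        chain(
            ("uv", "pip", "install", "--system"),
            split(o),  # Option string is tokenized with shell-like rules.
            ("--",),  # Everything after `--` is treated as a package specifier.
            packages,
        )
    )


argv = build_install_argv("duckdb", "polars", o="-U")
upgrade_argv = build_install_argv("duckdb", o="-U " + "--no-deps")
print(argv)
print(upgrade_argv)
```

The second call mirrors how `update()` prepends `-U` to whatever is passed in `o`.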
Source code in src/colab_assist/colab_assist.py
def install(*packages: str, o: str = "", timeout: int | None = 60) -> None:
    """Install package(s) using uv.

    - This function is a convenience interface of the uv command
        `uv pip install --system ⟨o⟩ -- ⟨packages⟩`.
        See also its convenience alias [`update()`][colab_assist.update].

    Args:
        packages: Specifiers of the packages to install.
            In addition to the uv-supported [package specifiers](
            https://docs.astral.sh/uv/pip/packages/#installing-a-package),
            colab-assist provides a shorthand for installing packages
            in remote Git repositories via HTTP:
            ```
            [⟨auth⟩@][⟨host⟩/]⟨owner⟩/⟨repo⟩[@⟨ref⟩]
            ```

            - `⟨auth⟩` is optional authorization info for a private repository,
                such as a [GitHub personal access token (PAT)](
                https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens).

                - If `⟨auth⟩` is `$`, an input prompt for authorization info will spawn.

                - If `⟨auth⟩` is prefixed with `$`, the part after `$` is treated as
                    the name of a [Colab Secret](https://stackoverflow.com/a/77737451)
                    containing the authorization info. Currently this is the recommended
                    way of managing private info on Colab.

                - Otherwise, `⟨auth⟩` is assumed to be the authorization info proper.

            - `⟨host⟩` is an optional specification of the remote repository host.

                - If `⟨host⟩` is not provided, the host domain name will be `github.com`.

                - If `⟨host⟩` is prefixed with `$`, it is treated as an abbreviation tag:
                    - `$gh` for `github.com`.
                    - `$gl` for `gitlab.com`.
                    - `$bb` for `bitbucket.org`.

                - Otherwise `⟨host⟩` is assumed to be a host domain name.

            - `⟨owner⟩/⟨repo⟩` is the required identifier of the remote repository.

            - `⟨ref⟩` is an optional specification of a branch, a tag, or a commit.

        o: Additional [options](https://docs.astral.sh/uv/reference/cli/#uv-pip-install)
            for the `uv pip install` command.

        timeout: Timeout in seconds for the spawned subprocess.

            - `None`: No timeout.

    Examples:
        ```py
        import colab_assist as A

        # Install the latest versions of DuckDB and Polars.
        # This is equivalent to `A.update("duckdb", "polars")`.
        A.install("duckdb", "polars", o="-U")

        # Use the PAT stored in the Colab Secret named `my-token`
        # to install the Python package hosted in the `feat/foo` branch
        # of the private GitHub repository `me/my-repo`.
        A.install("$my-token@me/my-repo@feat/foo")
        ```
    """
    if not packages:
        return

    try:
        result = subprocess.run(
            tuple(
                chain(
                    ("uv", "pip", "install", "--system"),
                    split(o),
                    ("--",),
                    (_parse_package_spec(p) for p in packages),
                )
            ),
            capture_output=True,
            encoding="utf-8",
            timeout=timeout,
        )
    except subprocess.TimeoutExpired as exc:
        print(exc)
    else:
        if result.returncode != 0:
            print(result.stderr, end="")

update

update(*packages, o='', timeout=60)

Update package(s) and dependencies using uv.

  • This is a convenience alias of install() with --upgrade included in o. The command uv pip install used by install() is relatively conservative by default: Already installed packages will not be updated unless an update is required to satisfy an explicit version restriction or to resolve an incompatibility. With --upgrade, uv will always try to update packages and their dependencies to the latest versions, but this also increases the risk of breaking the Colab environment.
Source code in src/colab_assist/colab_assist.py
def update(*packages: str, o: str = "", timeout: int | None = 60) -> None:
    """Update package(s) and dependencies using uv.

    - This is a convenience alias of [`install()`][colab_assist.install]
        with `--upgrade` included in `o`. The command `uv pip install`
        used by `install()` is relatively conservative by default:
        Already installed packages will not be updated unless an update is required to
        satisfy an explicit version restriction or to resolve an incompatibility.
        With `--upgrade`, uv will always try to update packages and their dependencies
        to the latest versions, but this also increases the risk of
        breaking the Colab environment.
    """
    install(*packages, o="-U " + o, timeout=timeout)

clone

clone(remote, basename=None, *, o='', x='', timeout=60)

Clone a Git repository and optionally make the Python package in it importable.

  • This function is a convenience interface of the Git command git clone ⟨o⟩ -- ⟨remote⟩ "/content/repos/⟨basename⟩". When cloning a Python package, you can enable extra options via parameter x to automatically make the cloned package importable.

  • This function is mainly designed to clone via HTTP and work with the shorthand remote format detailed in install():

    [⟨auth⟩@][⟨host⟩/]⟨owner⟩/⟨repo⟩[@⟨branch⟩]
    
    Any unrecognized remote argument is assumed to be a valid Git URL and passed as is to git clone.
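
To make the shorthand format more concrete, here is a minimal sketch that reuses the regular expression from the source listing of this function to turn a shorthand into an HTTPS URL. The names `REMOTE_RGX`, `HOST_NAMES`, and `shorthand_to_url` are illustrative only, the host table is assumed from the abbreviation tags documented in `install()`, and the `⟨auth⟩` part is matched but deliberately ignored.

```python
import re

# Same pattern as in the source listing: auth, host, owner/repo@branch.
REMOTE_RGX = (
    r"(?:(\$?[-%+.:\w]*)@)?"
    r"(?:\$(gh|gl|bb)/|([-a-zA-Z0-9]+\.[-.a-zA-Z0-9]+)/)?"
    r"([-\w]+)/([-.\w]+)(?:@([-./\w]+))?"
)
# Assumed mapping, based on the documented abbreviation tags.
HOST_NAMES = {"gh": "github.com", "gl": "gitlab.com", "bb": "bitbucket.org"}


def shorthand_to_url(remote: str):
    """Return (url, branch), or None if `remote` is not a shorthand."""
    matched = re.fullmatch(REMOTE_RGX, remote)
    if matched is None:
        return None  # The real function passes such input to `git clone` as is.
    _auth, host_tag, host_name, owner, repo, branch = matched.groups()
    host = HOST_NAMES.get(host_tag) or host_name or "github.com"
    return f"https://{host}/{owner}/{repo}.git", branch


print(shorthand_to_url("me/my-repo@feat/foo"))
print(shorthand_to_url("$gl/group/project"))
print(shorthand_to_url("https://example.com/x/y.git"))
```

A full Git URL does not match the pattern, which is why it falls through to the "passed as is" branch.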

Parameters:

  • remote (str) –

    Specifier of the remote Git repository to clone.

  • basename (str | None, default: None ) –

    Directory name of the clone.

    • None: Use the name of the remote repository.
  • o (str, default: '' ) –

    Additional options for the git clone command.

  • x (str, default: '' ) –

    A string as an order-agnostic set of single-letter extra option flags. An option is enabled if and only if its corresponding letter is in the string.

    • p for path: This option should not be used together with e. This option assumes the GitHub repository hosts a Python package, and will add its top-level module directory to sys.path. This allows importing the package without installing it, which may help avoid dependency conflicts with Colab-preinstalled packages.

      Notable implications include:

      • The cloned package is immediately importable without a session restart.
      • Changes in the clone (e.g. by pull()) can take effect via reload(); a session restart is not mandatory.
      • If a Colab session restart is triggered by restart(), the colab_assist module will try to recover sys.path upon import. Otherwise, you will need to manually re-add the top-level module directory to sys.path after a session restart.
    • e for editable: This option should not be used together with p. This option assumes the GitHub repository hosts a Python package, and will install the clone in editable/development mode.

      Notable implications include:

      • Currently an editable install requires a session restart to take effect.
      • But after the installation, changes in the clone can take effect via reload(); a session restart is not mandatory.
      • Unlike sys.path, editable install is not reset by session restarts.
  • timeout (int | None, default: 60 ) –

    Timeout in seconds for the spawned subprocess.

    • None: No timeout.

Examples:

import colab_assist as A

# Use the PAT stored in the Colab Secret named `my-token`
# to clone the `feat/foo` branch of the private GitHub repository `me/my-repo`
# into directory `/content/repos/foo/`,
# and then install the clone as an editable package.
A.clone("$my-token@me/my-repo@feat/foo", "foo", x="e")
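
For the `p` option described above, the source listing appends either the clone's `src/` subdirectory or the clone itself to `sys.path`. The sketch below isolates that choice in a hypothetical helper named `import_root` and tries it on two temporary directories; it does not touch `sys.path`.

```python
import os
import tempfile


def import_root(repo_path: str) -> str:
    """Pick the directory to put on sys.path (hypothetical helper)."""
    src_path = os.path.join(repo_path, "src")
    # Prefer a src layout when present, else fall back to the repository root.
    return src_path if os.path.isdir(src_path) else repo_path


with tempfile.TemporaryDirectory() as flat, tempfile.TemporaryDirectory() as nested:
    os.mkdir(os.path.join(nested, "src"))
    flat_ok = import_root(flat) == flat
    nested_ok = import_root(nested) == os.path.join(nested, "src")

print(flat_ok, nested_ok)
```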
Source code in src/colab_assist/colab_assist.py
def clone(
    remote: str,
    basename: str | None = None,
    *,
    o: str = "",
    x: str = "",
    timeout: int | None = 60,
) -> None:
    """Clone a Git repository and optionally make the Python package in it importable.

    - This function is a convenience interface of the Git command
        `git clone ⟨o⟩ -- ⟨remote⟩ "/content/repos/⟨basename⟩"`.
        When cloning a Python package, you can enable extra options via parameter `x`
        to automatically make the cloned package importable.

    - This function is mainly designed to clone via HTTP and work with
        the shorthand `remote` format detailed in [`install()`][colab_assist.install]:
        ```
        [⟨auth⟩@][⟨host⟩/]⟨owner⟩/⟨repo⟩[@⟨branch⟩]
        ```
        Any unrecognized `remote` argument is assumed to be a valid [Git URL](
        https://git-scm.com/docs/git-clone#_git_urls) and passed as is to `git clone`.

    Args:
        remote: Specifier of the remote Git repository to clone.

        basename: Directory name of the clone.

            - `None`: Use the name of the remote repository.

        o: Additional [options](https://git-scm.com/docs/git-clone#_options)
            for the `git clone` command.

        x: A string as an order-agnostic set of single-letter extra option flags.
            An option is enabled if and only if its corresponding letter is in the string.

            - `p` for _path_:
                This option should not be used together with `e`.
                This option assumes the GitHub repository hosts a Python package,
                and will add its top-level module directory to `sys.path`.
                This allows importing the package without installing it, which may help
                avoid dependency conflicts with Colab-preinstalled packages.

                Notable implications include:

                - The cloned package is immediately importable without a session restart.
                - Changes in the clone (e.g. by [`pull()`][colab_assist.pull])
                    can take effect via [`reload()`][colab_assist.reload];
                    a session restart is not mandatory.
                - If a Colab session restart is triggered by
                    [`restart()`][colab_assist.restart], the `colab_assist` module
                    will try to recover `sys.path` upon import. Otherwise,
                    you will need to manually re-add the top-level module directory
                    to `sys.path` after a session restart.

            - `e` for _editable_:
                This option should not be used together with `p`.
                This option assumes the GitHub repository hosts a Python package,
                and will install the clone in [editable/development mode](
                https://setuptools.pypa.io/en/latest/userguide/development_mode.html).

                Notable implications include:

                - Currently an editable install requires a session restart to take effect.
                - But after the installation, changes in the clone can take effect
                    via [`reload()`][colab_assist.reload];
                    a session restart is not mandatory.
                - Unlike `sys.path`, editable install is not reset by session restarts.

        timeout: Timeout in seconds for the spawned subprocess.

            - `None`: No timeout.

    Examples:
        ```py
        import colab_assist as A

        # Use the PAT stored in the Colab Secret named `my-token`
        # to clone the `feat/foo` branch of the private GitHub repository `me/my-repo`
        # into directory `/content/repos/foo/`,
        # and then install the clone as an editable package.
        A.clone("$my-token@me/my-repo@feat/foo", "foo", x="e")
        ```
    """
    remote_rgx = (
        r"(?:(\$?[-%+.:\w]*)@)?"  # auth
        r"(?:\$(gh|gl|bb)/|([-a-zA-Z0-9]+\.[-.a-zA-Z0-9]+)/)?"  # host
        r"([-\w]+)/([-.\w]+)(?:@([-./\w]+))?"  # owner/repo@branch
    )

    if matched := re.fullmatch(remote_rgx, remote):
        auth, host_tag, host_name, owner, repo, branch = matched.groups()

        if os.path.exists(repo_path := os.path.join(_REPOS_ROOT, basename or repo)):
            print(
                f"{repo_path} already exists. Use `pull('{basename or repo}')` instead?"
            )
            return

        host_name = _HOST_NAMES.get(host_tag) or host_name or "github.com"

        if auth:
            url = f"https://{_get_auth(auth)}@{host_name}/{owner}/{repo}.git"
        else:
            url = f"https://{host_name}/{owner}/{repo}.git"

        if branch:
            cmd = tuple(
                chain(("git", "clone", "-b", branch), split(o), ("--", url, repo_path))
            )
        else:
            cmd = tuple(chain(("git", "clone"), split(o), ("--", url, repo_path)))
    else:
        if basename:
            repo_path = os.path.join(_REPOS_ROOT, basename)
        elif matched := re.search(r"/([-.\w]+)/?$", remote):
            repo = matched.group(1)
            if repo.endswith(".git"):
                repo = repo[:-4]
            repo_path = os.path.join(_REPOS_ROOT, repo)
        else:
            print(
                f"Failed to infer `basename` from {remote}. "
                "Please provide `basename` or consider using `!git clone` instead."
            )
            return

        if os.path.exists(repo_path):
            print(
                f"{repo_path} already exists. "
                f"Consider `pull('{os.path.basename(repo_path)}')` instead?"
            )
            return

        cmd = tuple(chain(("git", "clone"), split(o), ("--", remote, repo_path)))

    try:
        result = subprocess.run(
            cmd, capture_output=True, encoding="utf-8", timeout=timeout
        )
    except subprocess.TimeoutExpired as exc:
        print(exc)
        return

    if result.returncode != 0:
        print(result.stderr, end="")
        return

    if "e" in x:
        _install_editable(repo_path, timeout)
        return

    if "p" in x:
        if os.path.isdir(src_path := os.path.join(repo_path, "src")):
            sys.path.append(src_path)
            _colab._sys_path_extensions.append(src_path)
        else:
            sys.path.append(repo_path)
            _colab._sys_path_extensions.append(repo_path)
        return

pull

pull(basename, *, o='', timeout=60)

Pull from a remote Git repository into its clone on Colab.

  • This function is a convenience interface for using the Git command git pull ⟨o⟩ in the repository directory /content/repos/⟨basename⟩/.

Parameters:

  • basename (str) –

    Base directory name of the clone in /content/repos/.

  • o (str, default: '' ) –

    Additional options for the git pull command.

  • timeout (int | None, default: 60 ) –

    Timeout in seconds for the spawned subprocess.

    • None: No timeout.
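
The `timeout` parameter is handed to `subprocess.run()`, as the source listing shows. The following sketch, under the assumption that a sleeping Python child process is a fair stand-in for a slow `git pull`, shows how an expired timeout surfaces as a message instead of an exception. The helper `run_with_timeout` is hypothetical.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Run `cmd`; return an error message, or None on success."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, encoding="utf-8", timeout=timeout
        )
    except subprocess.TimeoutExpired as exc:
        return str(exc)  # The child is killed before the exception is raised.
    return result.stderr if result.returncode != 0 else None


slow = [sys.executable, "-c", "import time; time.sleep(5)"]
fast = [sys.executable, "-c", "pass"]
slow_message = run_with_timeout(slow, timeout=0.5)
fast_message = run_with_timeout(fast, timeout=30)
print(slow_message)
```

Passing `timeout=None` would instead wait for the child process indefinitely.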
Source code in src/colab_assist/colab_assist.py
def pull(basename: str, *, o: str = "", timeout: int | None = 60) -> None:
    """Pull from a remote Git repository into its clone on Colab.

    - This function is a convenience interface for using the Git command `git pull ⟨o⟩`
        in the repository directory `/content/repos/⟨basename⟩/`.

    Args:
        basename: Base directory name of the clone in `/content/repos/`.

        o: Additional [options](https://git-scm.com/docs/git-pull#_options)
            for the `git pull` command.

        timeout: Timeout in seconds for the spawned subprocess.

            - `None`: No timeout.
    """
    repo_path = os.path.join(_REPOS_ROOT, basename)
    if not os.path.isdir(repo_path):
        print(f"{repo_path} does not exist or is not a directory.")
        return

    try:
        result = subprocess.run(
            ["git", "pull"] + split(o),
            cwd=repo_path,
            capture_output=True,
            encoding="utf-8",
            timeout=timeout,
        )
    except subprocess.TimeoutExpired as exc:
        print(exc)
    else:
        if result.returncode != 0:
            print(result.stderr, end="")

reload

reload(obj)

Reimport a module, function, or class.

  • This function internally uses importlib.reload() to reimport modules, and getattr() to retrieve attributes from reimported modules. So the limitations and caveats of importlib.reload() persist. In particular:

    • Reloading a module does not auto-reload its parent modules or submodules.
    • Names defined in the old version of the module but not in the new version (e.g. when an attribute is removed in an update) are not auto-deleted.
    • This function can correctly reload a non-module object only if the object has valid __name__ and __module__ attributes. So usually functions and classes are the only directly reloadable non-modules. However, after a module is reimported, its non-module attributes can be updated via import statements.
    • Reloading a class does not affect previously created instances.
    • To properly reload a non-module, the return value must be captured. It is recommended to always use the xxx = reload(xxx) pattern.
  • This function should mainly be used on modules, functions, or classes in your package for an update to take effect, and only when the changes are localized enough that restarting the Colab session is overkill.

  • See also %autoreload for automatically reloading multiple or all modules at once.

Parameters:

  • obj (object) –

    Object to reload. Usually should be a module, function, or class.

Returns:

  • object

    The reloaded object if reloading is successful. Otherwise, the original object is returned as is.

Examples:

import colab_assist as A
from my_pkg import my_func, MyClass

# (Behavior before update)

# (Update made to the source code of `my_pkg` on GitHub)
A.install("my_name/my_pkg")

my_func = A.reload(my_func)
MyClass = A.reload(MyClass)
my_obj = MyClass()

# (Updated behavior)
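
The `xxx = reload(xxx)` pattern can also be tried outside Colab with plain `importlib`. The sketch below writes a throwaway module (`demo_reload_mod`, a made-up name) to a temporary directory, edits it, and reloads one function. `reload_attr` is a simplified stand-in for this function without its error handling, not the actual implementation.

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # Keep the demo free of stale bytecode caches.


def reload_attr(obj):
    """Reload the module defining `obj` and return the fresh attribute."""
    module = importlib.reload(sys.modules[obj.__module__])
    return getattr(module, obj.__name__)


with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp, "demo_reload_mod.py")
    src.write_text("def greet():\n    return 'old'\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    from demo_reload_mod import greet

    stale = greet  # A reference captured before the update.
    src.write_text("def greet():\n    return 'new version'\n")
    greet = reload_attr(greet)  # Capture the return value.
    sys.path.remove(tmp)

print(stale(), "/", greet())
```

The `stale` reference keeps the old behavior, which is the same reason previously created instances of a reloaded class are unaffected.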
Source code in src/colab_assist/colab_assist.py
def reload(obj: object) -> object:
    """Reimport a module, function, or class.

    - This function internally uses [`importlib.reload()`](
        https://docs.python.org/3/library/importlib.html#importlib.reload)
        to reimport modules, and [`getattr()`](
        https://docs.python.org/3/library/functions.html#getattr)
        to retrieve attributes from reimported modules.
        So the limitations and caveats of `importlib.reload()` persist. In particular:

        - Reloading a module does not auto-reload its parent modules or submodules.
        - Names defined in the old version of the module but not in the new version
            (e.g. when an attribute is removed in an update) are not auto-deleted.
        - This function can correctly reload a non-module object
            only if the object has valid `__name__` and `__module__` attributes.
            So usually functions and classes are the only directly reloadable non-modules.
            However, after a module is reimported,
            its non-module attributes can be updated via `import` statements.
        - Reloading a class does not affect previously created instances.
        - To properly reload a non-module, the return value must be captured.
            It is recommended to always use the `xxx = reload(xxx)` pattern.

    - This function should mainly be used on modules, functions, or classes
        in your package for an update to take effect, and only when the changes are
        localized enough that restarting the Colab session is overkill.

    - See also [`%autoreload`](
        https://ipython.readthedocs.io/en/stable/config/extensions/autoreload.html)
        for automatically reloading multiple or all modules at once.

    Args:
        obj: Object to reload. Usually should be a module, function, or class.

    Returns:
        The reloaded object if reloading is successful.
            Otherwise, the original object is returned as is.

    Examples:
        ```py
        import colab_assist as A
        from my_pkg import my_func, MyClass

        # (Behavior before update)

        # (Update made to the source code of `my_pkg` on GitHub)
        A.install("my_name/my_pkg")

        my_func = A.reload(my_func)
        MyClass = A.reload(MyClass)
        my_obj = MyClass()

        # (Updated behavior)
        ```
    """
    if (name := getattr(obj, "__name__", None)) is None:
        print(f"Failed to reload {obj}: Missing attribute `__name__`.")
        return obj

    if isinstance(obj, ModuleType):  # [inspect.ismodule()](https://is.gd/7slO1C)
        if name == "__main__":
            print(f"Failed to reload {obj}: Cannot reload top-level module.")
            return obj
        return importlib.reload(obj)

    if (module_name := getattr(obj, "__module__", None)) is None:
        print(f"Failed to reload {obj}: Failed to determine the object's module.")
        return obj

    if module_name == "__main__":
        print(f"Failed to reload {obj}: Cannot reload objects defined at top level.")
        return obj

    return getattr(importlib.reload(sys.modules[module_name]), name)

secret

secret(name)

Retrieve Colab Secret.

  • This function is a convenience alias of google.colab.userdata.get().

Parameters:

  • name (str) –

    Name of a Colab Secret accessible to the Colab notebook.

Returns:

  • str

    The content of the Colab Secret.

Source code in src/colab_assist/colab_assist.py
def secret(name: str) -> str:
    """Retrieve Colab Secret.

    - This function is a convenience alias of `google.colab.userdata.get()`.

    Args:
        name: Name of a [Colab Secret](https://stackoverflow.com/a/77737451)
            accessible to the Colab notebook.

    Returns:
        The content of the Colab Secret.
    """
    return _colab.userdata.get(name)

edit

edit(path, *, x='')

Open an editor tab for a file.

  • This function wraps google.colab.files.view(), which currently does not support editing certain file types, e.g. .md.

Parameters:

  • path (str) –

    Path to the text file to edit.

  • x (str, default: '' ) –

    A string as an order-agnostic set of single-letter extra option flags. An option is enabled if and only if its corresponding letter is in the string.

    • c for create: By default, edit() does not create a new file. This option creates a blank file (and all its parent directories if necessary) if path does not exist.
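
The effect of the `c` flag can be sketched without Colab. The hypothetical helper `ensure_file` below follows the same steps as the source listing (create missing parent directories, then an empty file) and leaves out the call to `google.colab.files.view()`.

```python
import os
import tempfile


def ensure_file(path: str) -> None:
    """Create an empty file at `path` if nothing exists there yet."""
    if not os.path.exists(path):
        if parent := os.path.dirname(path):
            os.makedirs(parent, exist_ok=True)  # Create missing parents.
        open(path, "w").close()  # Create the blank file itself.


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "notes", "draft", "todo.txt")
    ensure_file(target)
    created = os.path.isfile(target)
    size = os.path.getsize(target)

print(created, size)
```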
Source code in src/colab_assist/colab_assist.py
def edit(path: str, *, x: str = "") -> None:
    """Open an editor tab for a file.

    - This function wraps `google.colab.files.view()`,
        which currently does not support editing certain file types, e.g. `.md`.

    Args:
        path: Path to the text file to edit.

        x: A string as an order-agnostic set of single-letter extra option flags.
            An option is enabled if and only if its corresponding letter is in the string.

            - `c` for _create_:
                By default, `edit()` does not create a new file.
                This option creates a blank file
                (and all its parent directories if necessary) if `path` does not exist.
    """
    if not os.path.exists(path):
        if "c" in x:
            if parent := os.path.dirname(path):
                os.makedirs(parent, exist_ok=True)
            open(path, "w").close()  # os.mknod() is not implemented for Google Drive.
        else:
            print(f"{path} does not exist.")
            return

    if os.path.isfile(path):
        _colab.files.view(path)
    else:
        print(f"{path} is not a file.")

download

download(url, path=None, *, chunk_size=131072)

Download a file from a URL.

Parameters:

  • url (str) –

    URL of the file to download.

  • path (str | None, default: None ) –

    Destination path of the downloaded file.

    • None: The file is saved in the current working directory and the file name is inferred from the response headers or the URL.
  • chunk_size (int, default: 131072 ) –

    Number of bytes read into memory while iterating over the response.

Returns:

  • str | None

    Absolute path of the downloaded file, or None if the download failed.
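
When `path` is `None`, the file name comes from the `Content-Disposition` header if present, and otherwise from the last segment of the URL path. The sketch below reproduces that inference with the standard library only. The helper name `infer_filename` and the example URLs are illustrative, and no network request is made.

```python
import os
from email.message import EmailMessage
from urllib.parse import urlparse


def infer_filename(url: str, content_disposition=None):
    """Infer a file name from a header value or a URL (hypothetical helper)."""
    if content_disposition:
        # EmailMessage parses the header and exposes its `filename` parameter.
        em = EmailMessage()
        em["Content-Disposition"] = content_disposition
        return em.get_filename()
    # Fall back to the last path segment; query strings are ignored.
    return os.path.basename(urlparse(url).path) or None


from_header = infer_filename(
    "https://example.com/dl", 'attachment; filename="report.csv"'
)
from_url = infer_filename("https://example.com/data/archive.zip?x=1")
no_name = infer_filename("https://example.com/")
print(from_header, from_url, no_name)
```

The last call returns `None`, which corresponds to the case where the function asks for an explicit `path`.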

Source code in src/colab_assist/colab_assist.py
def download(
    url: str, path: str | None = None, *, chunk_size: int = 131072
) -> str | None:
    """Download a file from a URL.

    Args:
        url: URL of the file to download.

        path: Destination path of the downloaded file.

            - `None`: The file is saved in the current working directory
                and the file name is inferred from the response headers or the URL.

        chunk_size: Number of bytes read into memory while iterating over the response.

            - This argument is passed directly to [`requests.Response.iter_content()`](
                https://requests.readthedocs.io/en/latest/api/#requests.Response.iter_content).

    Returns:
        Absolute path of the downloaded file, or `None` if the download failed.
    """
    with requests.get(url, stream=True) as resp:
        if resp.status_code != 200:
            print(f"Status {resp.status_code}: {_get_resp_reason(resp)}")
            return

        if path is None:
            if h := resp.headers.get("Content-Disposition"):
                em = EmailMessage()
                em["Content-Disposition"] = h
                path = f"{os.getcwd()}/{em.get_filename()}"
            elif filename := os.path.basename(urlparse(url).path):
                path = f"{os.getcwd()}/{filename}"
            else:
                print(f"Failed to infer file name from {url}. Please specify `path`.")
                return

        file_size = int(resp.headers.get("Content-Length", 0))
        with (
            open(path, "wb") as file,
            tqdm(
                total=file_size,
                unit="iB",
                unit_scale=True,
                unit_divisor=1024,
            ) as bar,
        ):
            for chunk in resp.iter_content(chunk_size=chunk_size):
                bar.update(file.write(chunk))

    return os.path.abspath(path)  # Documented return value is an absolute path.

restart

restart()

Trigger a Colab session restart after some bookkeeping operations.

  • Colab is expected to issue a notification: "Your session crashed for an unknown reason."
  • Explicitly restarting the Colab session, e.g., with this function, with exit(), or with the Restart session command in the Colab Runtime menu, resets all imports and variables in the Python interpreter session. However, as long as the runtime is not deleted, the installed packages, files in the virtual disk, and Google Drive (if mounted) are preserved. This function does some extra bookkeeping so that certain session states can be recovered upon importing colab_assist in the next session.
Source code in src/colab_assist/colab_assist.py
def restart() -> None:
    """Trigger a Colab session restart after some bookkeeping operations.

    - Colab is expected to issue a notification:
        "Your session crashed for an unknown reason."
    - Explicitly restarting the Colab session, e.g., with this function,
        with `exit()`, or with the `Restart session` command in the Colab `Runtime` menu,
        resets all imports and variables in the Python interpreter session.
        However, as long as the runtime is not deleted, the installed packages,
        files in the virtual disk, and Google Drive (if mounted) are preserved.
        This function does some extra bookkeeping so that certain session states
        can be recovered upon importing `colab_assist` in the next session.
    """
    _colab._save_state()

    if (ishell := get_ipython()) is None:
        print(
            "Interactive shell not found. Try `exit()` or `Runtime -> Restart session`."
        )
    else:
        ishell.ask_exit()  # type: ignore

mount

mount(force=False)

Mount Google Drive.

Parameters:

  • force (bool, default: False ) –

    Option to force remounting if Google Drive is already mounted.

Source code in src/colab_assist/colab_assist.py
def mount(force: bool = False) -> None:
    """Mount Google Drive.

    Args:
        force: Option to force remounting if Google Drive is already mounted.
    """
    _colab.drive.mount(_DRIVE_MNTPT, force_remount=force)

unmount

unmount()

Flush and unmount Google Drive.

Source code in src/colab_assist/colab_assist.py
def unmount() -> None:
    """Flush and unmount Google Drive."""
    _colab.drive.flush_and_unmount()

end

end()

Terminate the Colab runtime after some cleanup operations.

Source code in src/colab_assist/colab_assist.py
def end() -> None:
    """Terminate the Colab runtime after some cleanup operations."""
    _clear_repos()
    _colab.drive.flush_and_unmount()
    _colab.runtime.unassign()

update_git

update_git(timeout=90)

Update Git.

  • The current implementation uses APT, so the update is relatively slow (about 1 minute).

Parameters:

  • timeout (int | None, default: 90 ) –

    Timeout in seconds for the spawned subprocess.

    • None: No timeout.
Source code in src/colab_assist/colab_assist.py
def update_git(timeout: int | None = 90) -> None:
    """Update Git.

    - The current implementation uses APT, so the update is relatively slow (about 1 minute).

    Args:
        timeout: Timeout in seconds for the spawned subprocess.

            - `None`: No timeout.
    """
    if _colab._git_updated:
        return

    try:
        result = subprocess.run(
            "add-apt-repository -y 'ppa:git-core/ppa' && apt-get -y install git",
            shell=True,  # For `&&`.
            capture_output=True,
            encoding="utf-8",
            timeout=timeout,
        )
    except subprocess.TimeoutExpired as exc:
        print(exc)
    else:
        if result.returncode != 0:
            print(result.stderr, end="")
        else:
            _colab._git_updated = True