 .. Licensed to the Apache Software Foundation (ASF) under one
    or more contributor license agreements.  See the NOTICE file
    distributed with this work for additional information
    regarding copyright ownership.  The ASF licenses this file
    to you under the Apache License, Version 2.0 (the
    "License"); you may not use this file except in compliance
    with the License.  You may obtain a copy of the License at

 ..   http://www.apache.org/licenses/LICENSE-2.0

 .. Unless required by applicable law or agreed to in writing,
    software distributed under the License is distributed on an
    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    KIND, either express or implied.  See the License for the
    specific language governing permissions and limitations
    under the License.

Regular development tasks
=========================
The regular Breeze development tasks are available as top-level commands. These tasks are used most
often during development, which is why they are available without any sub-command. More advanced
commands are grouped into sub-commands.

**The outline for this document is available in GitHub via the top-right corner button (with 3 dots and 3 lines).**

Entering Breeze shell
---------------------
This is the most frequently used feature of Breeze. It simply allows you to enter a shell inside the Breeze
development environment (inside the Breeze container).

You can use additional ``breeze`` flags to choose your environment. You can specify the Python
version to use and the backend (the metadata database). This way, with Breeze, you can recreate the same
environments as we have in the CI matrix builds. See the next chapter for backend selection.

For example, you can choose to run Python 3.10 tests with MySQL 8.0 as the backend
as follows:

.. code-block:: bash

    breeze --python 3.10 --backend mysql --mysql-version 8.0

.. note:: Note for Windows WSL2 users

   You may encounter error messages such as:

   .. code-block:: bash

      Current context is now "..."
      protocol not available
      Error 1 returned

   Try adding ``--builder=default`` to your command. For example:

   .. code-block:: bash

      breeze --builder=default --python 3.10 --backend mysql --mysql-version 8.0

The choices you make are persisted in the ``./.build/`` cache directory so that the next time you use the
``breeze`` script, it can reuse the values that were used previously. This way you do not have to specify
them when you run the script. You can delete the ``.build/`` directory if you want to restore the
default settings.

You can also run Breeze with ``SKIP_SAVING_CHOICES`` set to a non-empty value; the invocation will then not
save the used values to the cache. This is useful when you run non-interactive scripts with ``breeze shell``
and want to - for example - force the Python version used only for that execution, without changing the
Python version that was used last time.

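For example, a non-interactive script could force a Python version for a single invocation without touching the cached choice (a sketch; the command executed inside the shell is just an illustration):

```shell
# SKIP_SAVING_CHOICES only needs to be non-empty; "true" is just a convention.
# The --python choice below is used for this run but not written to .build/.
SKIP_SAVING_CHOICES=true breeze shell --python 3.10 "airflow version"
```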
Parameter values that can be stored persistently in the cache are marked with ``>VALUE<``
in the command help (for example in the output of ``breeze config --help``).

Selecting Backend
-----------------
When you run breeze commands, you can additionally select which backend you want to use. Currently Airflow
supports SQLite, MySQL and Postgres as backends - MySQL and Postgres are supported in various versions.

You can choose which backend to use by adding the ``--backend`` flag, and you can additionally select the
version of the backend if you want to start a different one (for example for ``--backend postgres``
you can specify ``--postgres-version 13`` to start Postgres 13). The ``--help`` output of breeze commands
shows you which backends are supported and which versions are available for each backend.

The choices you make for backend and version are "sticky" - the last used selection is cached in the
``.build`` folder. The next time you run any of the ``breeze`` commands that use a backend, it will use the
last selected backend and version.

.. note::

   You can also (temporarily, for the duration of a single command) override the backend version
   via the ``BACKEND_VERSION`` environment variable. This is used mostly in CI, where we have a common
   way of running tests for all backends and want to specify different parameters. In order to override
   the backend version, it has to be a valid version for the backend you are using. For example, if you
   set ``BACKEND_VERSION`` to ``13`` and you are using ``--backend postgres``, Postgres 13 will be used,
   but if you set ``BACKEND_VERSION`` to ``8.0`` and you are using ``--backend postgres``, the last used
   Postgres version will be used instead.

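As a sketch, overriding the backend version for a single command could look like this (``13`` is just an example of a valid Postgres version):

```shell
# Valid override: 13 is a supported Postgres version, so Postgres 13 is used.
BACKEND_VERSION=13 breeze --backend postgres

# Invalid override: 8.0 is a MySQL version, not a Postgres one, so the
# last used Postgres version is used instead.
BACKEND_VERSION=8.0 breeze --backend postgres
```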
Breeze will inform you at startup which backend and version it is using:

.. raw:: html

    <div align="center">
      <img src="images/version_information.png" width="640" alt="Version information printed by Breeze">
    </div>

Port Forwarding
---------------
When you run Airflow Breeze, the following ports are automatically forwarded:

.. code-block::

    * 12322 -> forwarded to Airflow ssh server -> airflow:22
    * 28080 -> forwarded to Airflow API server -> airflow:8080
    * 25555 -> forwarded to Flower dashboard -> airflow:5555
    * 25433 -> forwarded to Postgres database -> postgres:5432
    * 23306 -> forwarded to MySQL database -> mysql:3306
    * 26379 -> forwarded to Redis broker -> redis:6379

You can connect to these ports/databases using:

.. code-block::

    * ssh connection for remote debugging: ssh -p 12322 airflow@localhost pw: airflow
    * API server: http://localhost:28080
    * Flower: http://localhost:25555
    * Postgres: jdbc:postgresql://localhost:25433/airflow?user=postgres&password=airflow
    * MySQL: jdbc:mysql://localhost:23306/airflow?user=root
    * Redis: redis://localhost:26379/0

If you do not use the ``start-airflow`` command, you can use ``tmux`` to multiply terminals.

You may need to create a user prior to running the API server in order to log in.

**Authentication and User Management**

The authentication method depends on which auth manager is configured:

**SimpleAuthManager (default in Airflow 3.x)**

SimpleAuthManager is the default authentication manager and comes pre-configured with test usernames and passwords for development:

.. code-block::

    * admin:admin (Admin role)
    * viewer:viewer (Viewer role)
    * user:user (User role)
    * op:op (Operator role)

These users are automatically available when using SimpleAuthManager and require no additional setup.

**FabAuthManager**

When using FabAuthManager, you can create users manually:

.. code-block:: bash

    airflow users create --role Admin --username admin --password admin --email admin@example.com --firstname foo --lastname bar

Or use the ``--create-all-roles`` flag with ``start-airflow`` in dev mode to automatically create test users:

.. code-block:: bash

    breeze start-airflow --dev-mode --create-all-roles --auth-manager FabAuthManager

This will create the following test users:

.. code-block::

    * admin:admin (Admin role)
    * viewer:viewer (Viewer role)
    * user:user (User role)
    * op:op (Op role)
    * testadmin:testadmin (Admin role)

.. note::

   The ``airflow users`` command is only available when the `FAB auth manager <https://airflow.apache.org/docs/apache-airflow-providers-fab/stable/auth-manager/index.html>`_ is enabled.

For databases, you need to run ``airflow db reset`` at least once (or run some tests) after you start
Airflow Breeze to get the database/tables created. You can connect to the databases with an IDE or any other
database client:

.. raw:: html

    <div align="center">
      <img src="images/database_view.png" width="640" alt="Airflow Breeze - Database view">
    </div>

You can change the host port numbers used by setting the appropriate environment variables:

* ``SSH_PORT``
* ``WEB_HOST_PORT`` - API server when ``--use-airflow-version`` is used
* ``POSTGRES_HOST_PORT``
* ``MYSQL_HOST_PORT``
* ``MSSQL_HOST_PORT``
* ``FLOWER_HOST_PORT``
* ``REDIS_HOST_PORT``
* ``RABBITMQ_HOST_PORT``

If you set these variables, the new ports will be in effect the next time you enter the environment.

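For example, you might move the Postgres and API server host ports out of the way of other services (the port values below are arbitrary examples):

```shell
# Example values only - pick any free ports on your host.
export POSTGRES_HOST_PORT=35433
export WEB_HOST_PORT=38080

# The new ports take effect the next time you enter the Breeze environment,
# e.g. with: breeze --backend postgres
echo "Postgres will be exposed on localhost:${POSTGRES_HOST_PORT}"
echo "API server will be exposed on localhost:${WEB_HOST_PORT}"
```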
Remote Debugging in IDE
-----------------------
One of the possibilities with Breeze (albeit easy only if you have a paid version of the IntelliJ IDEs,
for example) is the option to run remote debugging from your IDE's graphical interface.

When you run tests, Airflow, or example Dags - even when you run them as unit tests - they run in a
separate container. This makes it a little harder to use your IDE's built-in debugger.
Fortunately, IntelliJ/PyCharm provides an effective remote debugging feature (but only in the paid versions).
See additional details on
`remote debugging <https://www.jetbrains.com/help/pycharm/remote-debugging-with-product.html>`_.
You can set up your remote debugging session as follows:

.. image:: images/setup_remote_debugging.png
    :align: center
    :alt: Setup remote debugging

Note that on macOS, you have to use the real IP address of your host rather than the default
localhost, because on macOS the container runs in a virtual machine with a different IP address.

Make sure to configure source code mapping in the remote debugging configuration to map
your local sources to the ``/opt/airflow`` location of the sources within the container:

.. image:: images/source_code_mapping_ide.png
    :align: center
    :alt: Source code mapping

.. note::

   For comprehensive debugging documentation using the new ``--debug`` and ``--debugger`` flags
   with VSCode and debugpy, see the `Debugging Airflow Components <../../contributing-docs/20_debugging_airflow_components.rst>`__
   guide.

Building the documentation
--------------------------
To build the documentation in Breeze, use the ``build-docs`` command:

.. code-block:: bash

    breeze build-docs

The results of the build can be found in the ``generated/_build`` folder.

The documentation build consists of three steps:

* verifying consistency of indexes
* building documentation
* spell checking

You can run only one of these stages by providing the ``--spellcheck-only`` or ``--docs-only`` flag:

.. code-block:: bash

    breeze build-docs --spellcheck-only

This process can take some time, so to make it shorter you can filter by package, using the package's
short ``provider id`` (you can specify more than one):

.. code-block:: bash

    breeze build-docs <provider id> <provider id>

To build documentation for the Task SDK package, use the following command:

.. code-block:: bash

    breeze build-docs task-sdk

Alternatively, you can use a package filter. Filters are glob patterns matched against full
package names, so a single filter can select more than one package:

.. code-block:: bash

    breeze build-docs --package-filter apache-airflow-providers-*

Inventory cache handling
^^^^^^^^^^^^^^^^^^^^^^^^
When building documentation, Sphinx downloads intersphinx inventories to enable cross-references
between documentation sets. By default, missing third-party inventories (e.g., Pandas, SQLAlchemy)
produce warnings but do **not** fail the build — third-party servers can be temporarily unavailable.
If a cached version exists, it will be used with a warning.
Use ``--clean-inventory-cache`` to force a fresh download of all inventories, or
``--fail-on-missing-third-party-inventories`` to fail the build when any third-party inventory
is missing (useful for publishing). Note that ``--clean-build`` cleans build artifacts but
preserves the inventory cache.
Errors during documentation generation often come from the docstrings of auto-api generated classes.
During the docs build, auto-api generated files are stored in the ``generated`` folder. This helps you
easily identify where the documentation problems originated.

These are all available flags of the ``build-docs`` command:

.. image:: ./images/output_build-docs.svg
    :target: https://raw.githubusercontent.com/apache/airflow/main/dev/breeze/images/output_build-docs.svg
    :width: 100%
    :alt: Breeze build documentation

While you can use the full name of a doc package starting with ``apache-airflow-providers-`` in the
package filter, you can also use a shorthand version - just take the remaining part and replace every
dash (``-``) with a dot (``.``).

Example:

If the provider name is ``apache-airflow-providers-cncf-kubernetes``, the shorthand is ``cncf.kubernetes``.

Note: for building docs for the apache-airflow-providers index, use ``apache-airflow-providers``
as the shorthand.

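Concretely, the following two invocations select the same provider package (assuming the ``cncf.kubernetes`` provider exists in your checkout):

```shell
# Full package-filter form:
breeze build-docs --package-filter apache-airflow-providers-cncf-kubernetes

# Equivalent shorthand form:
breeze build-docs cncf.kubernetes
```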
Running static checks
---------------------
You can run static checks via prek.

For example, the following command:

.. code-block:: bash

    prek mypy-airflow-core

will run the mypy check on currently staged files inside ``airflow/``, excluding providers.

.. _breeze-dev:running-prek-in-breeze:
A note on running ``prek`` inside the Breeze container
------------------------------------------------------

While ``prek`` (pre-commit) is intended to be run on your host machine, it can
also be run from within the Breeze shell for debugging or manual checks.
If you choose to do this, you may need to mount all sources by running
``breeze shell --mount-sources all``.
Selecting files to run static checks on
---------------------------------------
By default, prek hooks run on your staged changes. They run on all the files you have added with
``git add`` and ignore any changes that you have modified but not staged. If you want to run them
on all your modified files, stage those files with the ``git add`` command first.

With ``--all-files`` you can run static checks on all files in the repository. This is useful when you
want to be sure they will not fail in CI, or when you have just rebased your changes and want to
re-run the latest prek hooks on them, but it can take a long time (a few minutes) to get the result.

.. code-block:: bash

    prek mypy-airflow-core --all-files

The above will run the mypy check on all files.

You can limit that by selecting the specific files you want to run static checks on, by
specifying the ``--file`` flag (it can be given multiple times):

.. code-block:: bash

    prek mypy-airflow-core --file airflow/utils/code_utils.py --file airflow/utils/timeout.py

The above will run the mypy check on those two files (note: autocomplete should work for the file selection).

However, often you do not remember which files you modified and you want to run checks on the files that
belong to specific commits you already have in your branch. You can use ``prek`` to run the checks
only on changed files you have already committed to your branch - for a specific commit, for the last
commit, for all changes in your branch since you branched off from main, or for a specific range
of commits you choose.

.. code-block:: bash

    prek mypy-airflow-core --last-commit

The above will run the mypy check on all files changed in the last commit in your branch.

.. code-block:: bash

    prek identity --verbose --from-ref HEAD^^^^ --to-ref HEAD

The above will run the check for the last 4 commits in your branch. You can use any ``commit-ish``
references in the ``--from-ref`` and ``--to-ref`` flags.

.. note::

   When you run static checks, some artifacts (such as the mypy cache) are stored to speed up static
   check execution significantly:

   - The providers ``mypy-providers`` hook runs via Breeze and stores its cache in the
     ``mypy-cache-volume`` docker-compose volume.
   - Each non-provider ``mypy-*`` hook uses its own dedicated virtualenv and mypy cache under
     ``.build/mypy-venvs/<hook>/`` and ``.build/mypy-caches/<hook>/``; mypy itself is installed
     from the workspace ``uv.lock`` via the ``mypy`` dependency group (``uv sync --group mypy``).

   If the cache gets broken, run ``breeze down --cleanup-mypy-cache``, which wipes the docker
   volume and every per-hook ``.build/mypy-venvs/`` and ``.build/mypy-caches/`` directory.

.. note::

   You cannot change the Python version for static checks that are run within Breeze containers.
   The ``--python`` flag has no effect on them: they are always run with the lowest supported Python
   version. The main reason is to keep consistency in the results of static checks and to make sure
   that our code is fine when running the lowest supported version.

Starting Airflow
----------------
For testing Airflow you often want to start multiple components (in multiple terminals). Breeze has a
built-in ``start-airflow`` command that starts the Breeze container, launches multiple terminals using
tmux, and starts all the necessary Airflow components in those terminals.

When you are starting Airflow from local sources, the www asset compilation is executed automatically
beforehand.

.. code-block:: bash

    breeze --python 3.10 --backend mysql start-airflow

You can also use it to start a different executor:

.. code-block:: bash

    breeze start-airflow --executor CeleryExecutor

You can also use it to start any released version of Airflow from ``PyPI`` with the
``--use-airflow-version`` flag - useful for testing and for looking at issues raised for a specific version.

.. code-block:: bash

    breeze start-airflow --python 3.10 --backend mysql --use-airflow-version 2.7.0

When you are installing a version from PyPI, it is also possible to specify extras that should be used
when installing Airflow - you can provide several extras separated by commas - for example to install
providers together with the Airflow version you are installing. For example, when you are using the
celery executor in Airflow 2.7.0+ you need to add the ``celery`` extra:

.. code-block:: bash

    breeze start-airflow --use-airflow-version 2.7.0 --executor CeleryExecutor --airflow-extras celery

These are all available flags of the ``start-airflow`` command:

.. image:: ./images/output_start-airflow.svg
    :target: https://raw.githubusercontent.com/apache/airflow/main/dev/breeze/images/output_start-airflow.svg
    :width: 100%
    :alt: Breeze start-airflow

Running External System Integrations with Breeze
------------------------------------------------
You can run Airflow alongside external systems in Breeze, such as Kafka, Cassandra, MongoDB, and more.

To start Airflow with an integration, use the following command:

.. code-block:: bash

    breeze --python 3.10 --backend postgres --integration <integration_name>

For example, to run Airflow with Kafka:

.. code-block:: bash

    breeze --python 3.10 --backend postgres --integration kafka

Check the available integrations by running:

.. code-block:: bash

    breeze --integration --help

Launching multiple terminals in the same environment
----------------------------------------------------
Often, if you want to run full Airflow in the Breeze environment, you need to launch multiple terminals
and run ``airflow api-server``, ``airflow scheduler``, ``airflow worker`` in separate terminals.

This can be achieved either via ``tmux`` or via exec-ing into the running container from the host. Tmux
is installed inside the container and you can launch it with the ``tmux`` command. Tmux provides you with
the capability of creating multiple virtual terminals and multiplexing between them. More about ``tmux``
can be found at the `tmux GitHub wiki page <https://github.com/tmux/tmux/wiki>`_. Tmux has several useful
shortcuts that allow you to split terminals, open new tabs, etc. - it is well worth learning.

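If you are new to tmux, the stock default key bindings below cover most of what you need inside the container (your local tmux configuration may differ):

```shell
tmux                 # start a new tmux session inside the container
# Inside the session (the default prefix is Ctrl-b):
#   Ctrl-b %     split the current pane vertically
#   Ctrl-b "     split the current pane horizontally
#   Ctrl-b o     cycle between panes
#   Ctrl-b d     detach from the session (run "tmux attach" to come back)
```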
Another way is to exec into the Breeze container from a host terminal. Often you
have multiple terminals on the host (Linux/macOS/WSL2 on Windows) and you can simply use those terminals
to enter the running container. It is as easy as launching ``breeze exec`` while the Breeze environment
is already started. You will be dropped into bash with environment variables read in the same
way as when you enter the environment. You can do it multiple times and open as many terminals as you need.

These are all available flags of the ``exec`` command:

.. image:: ./images/output_exec.svg
    :target: https://raw.githubusercontent.com/apache/airflow/main/dev/breeze/images/output_exec.svg
    :width: 100%
    :alt: Breeze exec

Breeze cleanup
--------------
Sometimes you need to clean up your docker environment (and it is recommended that you do so regularly).
There are several reasons why you might want to do that.

Breeze uses docker images heavily and those images are rebuilt periodically, which might leave dangling,
unused images in the docker cache. This can cause extra disk usage. Also, running various docker compose
commands (for example running tests with ``breeze testing core-tests``) might create additional docker
networks that can prevent new networks from being created. Those networks are not removed automatically
by docker-compose. Also, Breeze uses its own cache to keep information about all images.

All those unused images, networks and caches can be removed by running the ``breeze cleanup`` command.
By default it will not remove the most recent images that you might need to run breeze commands, but you
can also remove those breeze images to clean up everything by adding the ``--all`` flag (note that you
will then need to build the images again from scratch - pulling from the registry might take a while).

Breeze will ask you to confirm each step, unless you specify the ``--answer yes`` flag.

These are all available flags of the ``cleanup`` command:

.. image:: ./images/output_cleanup.svg
    :target: https://raw.githubusercontent.com/apache/airflow/main/dev/breeze/images/output_cleanup.svg
    :width: 100%
    :alt: Breeze cleanup

Database and config volumes in Breeze
-------------------------------------
Breeze keeps the data for all its integrations, databases and configuration in named docker volumes.
Those volumes are persisted until you run the ``breeze down`` command. You can preserve the volumes by
adding the ``--preserve-volumes`` flag when you run the command. Then, the next time you start Breeze,
it will have the data pre-populated.

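A typical sketch of keeping the database state across restarts:

```shell
# Stop the environment but keep the named volumes (database data, config).
breeze down --preserve-volumes

# The next environment you start sees the pre-populated data.
breeze shell
```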
These are all available flags of the ``down`` command:

.. image:: ./images/output_down.svg
    :target: https://raw.githubusercontent.com/apache/airflow/main/dev/breeze/images/output_down.svg
    :width: 100%
    :alt: Breeze down

Running arbitrary commands in container
---------------------------------------
A more sophisticated way of using the Breeze shell is the ``breeze shell`` command - it has more
parameters and you can also use it to execute arbitrary commands inside the container.

.. code-block:: bash

    breeze shell "ls -la"

These are all available flags of the ``shell`` command:

.. image:: ./images/output_shell.svg
    :target: https://raw.githubusercontent.com/apache/airflow/main/dev/breeze/images/output_shell.svg
    :width: 100%
    :alt: Breeze shell

Running commands without interactive shell
------------------------------------------
For automated testing and one-off command execution, you can use the ``breeze run`` command
to execute commands in the Breeze environment without entering the interactive shell. This command is
particularly useful when you want to run a specific command and exit immediately, without the overhead
of an interactive session.
The ``breeze run`` command creates a fresh container that is automatically cleaned up after the command
completes, and each run uses a unique project name to avoid conflicts with other instances.
Here are some common examples:

Running a specific test:

.. code-block:: bash

    breeze run pytest providers/google/tests/unit/google/cloud/operators/test_dataflow.py -v

Running Python commands:

.. code-block:: bash

    breeze run python -c "from airflow.providers.google.version_compat import AIRFLOW_V_3_0_PLUS; print(AIRFLOW_V_3_0_PLUS)"

Running bash commands:

.. code-block:: bash

    breeze run bash -c "cd /opt/airflow && python -m pytest providers/google/tests/"

Running with a different Python version:

.. code-block:: bash

    breeze run --python 3.11 pytest providers/standard/tests/unit/operators/test_bash.py

Running with the PostgreSQL backend:

.. code-block:: bash

    breeze run --backend postgres pytest providers/postgres/tests/

These are all available flags of the ``run`` command:

.. image:: ./images/output_run.svg
    :target: https://raw.githubusercontent.com/apache/airflow/main/dev/breeze/images/output_run.svg
    :width: 100%
    :alt: Breeze run

Running Breeze with Metrics
---------------------------
Running Breeze with a StatsD Metrics Stack
..........................................
You can launch an instance of Breeze pre-configured to emit StatsD metrics using
``breeze start-airflow --integration statsd``. This will launch an Airflow API server
within the Breeze environment as well as containers running StatsD, Prometheus, and
Grafana. The integration configures the "Targets" in Prometheus, the "Datasources" in
Grafana, and includes a default dashboard in Grafana.
When you run Airflow Breeze with this integration, in addition to the standard ports
(see "Port Forwarding" above), the following are also automatically forwarded:

* 29102 -> forwarded to StatsD Exporter -> breeze-statsd-exporter:9102
* 29090 -> forwarded to Prometheus -> breeze-prometheus:9090
* 23000 -> forwarded to Grafana -> breeze-grafana:3000

You can connect to these ports using:

* StatsD Metrics: http://127.0.0.1:29102/metrics
* Prometheus Targets: http://127.0.0.1:29090/targets
* Grafana Dashboards: http://127.0.0.1:23000/dashboards

Running Breeze with an OpenTelemetry Metrics Stack
..................................................
----
[Work in Progress]
NOTE: This will launch the stack as described below but Airflow integration is
still a Work in Progress. This should be considered experimental and likely to
change by the time Airflow fully supports emitting metrics via OpenTelemetry.
----
You can launch an instance of Breeze pre-configured to emit OpenTelemetry metrics
using ``breeze start-airflow --integration otel``. This will launch Airflow within
the Breeze environment as well as containers running OpenTelemetry-Collector,
Prometheus, and Grafana. The integration handles all configuration of the
"Targets" in Prometheus and the "Datasources" in Grafana, so it is ready to use.
When you run Airflow Breeze with this integration, in addition to the standard ports
(see "Port Forwarding" above), the following are also automatically forwarded:

* 28889 -> forwarded to OpenTelemetry Collector -> breeze-otel-collector:8889
* 29090 -> forwarded to Prometheus -> breeze-prometheus:9090
* 23000 -> forwarded to Grafana -> breeze-grafana:3000
You can connect to these ports using:
* OpenTelemetry Collector: http://127.0.0.1:28889/metrics
* Prometheus Targets: http://127.0.0.1:29090/targets
* Grafana Dashboards: http://127.0.0.1:23000/dashboards
Running Breeze with OpenLineage
...............................
You can launch an instance of Breeze pre-configured to emit OpenLineage events using
``breeze start-airflow --integration openlineage``. This will launch an Airflow API server
within the Breeze environment as well as containers running a `Marquez <https://marquezproject.ai/>`_
API server.

When you run Airflow Breeze with this integration, in addition to the standard ports
(see "Port Forwarding" above), the following are also automatically forwarded:

* MARQUEZ_API_HOST_PORT (default 25000) -> forwarded to Marquez API -> marquez:5000
* MARQUEZ_API_ADMIN_HOST_PORT (default 25001) -> forwarded to Marquez Admin API -> marquez:5001
* MARQUEZ_HOST_PORT (default 23100) -> forwarded to Marquez -> marquez_web:3000

You can connect to these services using:

* Marquez Webserver: http://127.0.0.1:23100
* Marquez API: http://127.0.0.1:25000/api/v1
* Marquez Admin API: http://127.0.0.1:25001

Make sure to substitute the port numbers if you have customized them via the above env vars.

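For example, moving the Marquez web UI to another host port before starting the stack (``23200`` is an arbitrary free port chosen for illustration):

```shell
# Customize the Marquez UI host port, then start Airflow with OpenLineage.
export MARQUEZ_HOST_PORT=23200
breeze start-airflow --integration openlineage
# The Marquez UI is then available at http://127.0.0.1:23200
```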
Stopping the environment
------------------------
After starting up, the environment runs in the background and takes quite a lot of memory, which you
might want to free for other things you are running on your host.

You can always stop it via:

.. code-block:: bash

    breeze down

These are all available flags of the ``down`` command:

.. image:: ./images/output_down.svg
    :target: https://raw.githubusercontent.com/apache/airflow/main/dev/breeze/images/output_down.svg
    :width: 100%
    :alt: Breeze down

Using local virtualenv environment in Your Host IDE
---------------------------------------------------
You can set up your host IDE (for example, IntelliJ's PyCharm/Idea) to work with Breeze
and benefit from all the features provided by your IDE, such as local and remote debugging,
language auto-completion, documentation support, etc.
To use your host IDE with Breeze:
1. Create a local virtual environment:
You can use any of the following wrappers to create and manage your virtual environments:
`pyenv <https://github.com/pyenv/pyenv>`_, `pyenv-virtualenv <https://github.com/pyenv/pyenv-virtualenv>`_,
or `virtualenvwrapper <https://virtualenvwrapper.readthedocs.io/en/latest/>`_.
2. Use the right command to activate the virtualenv (``workon`` if you use virtualenvwrapper or
``pyenv activate`` if you use pyenv).

3. Initialize the created local virtualenv:

.. code-block:: bash

    ./scripts/tools/initialize_virtualenv.py

.. warning::

   Make sure that you use the right Python version in this command - matching the Python version you have
   in your local virtualenv. If you don't, you will get strange conflicts.

4. Select the virtualenv you created as the project's default virtualenv in your IDE.
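Put together, steps 1-3 above might look like this with pyenv-virtualenv (the virtualenv name and Python version are examples, and the Python version must already be installed via pyenv):

```shell
# Create and activate a virtualenv matching a supported Airflow Python version.
pyenv virtualenv 3.10 airflow-venv
pyenv activate airflow-venv

# Initialize it for Airflow development (run from the repository root).
./scripts/tools/initialize_virtualenv.py
```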
Note that you can also use the local virtualenv for Airflow development without Breeze.
This is a lightweight solution that has its own limitations.
More details on using the local virtualenv are available in the
`Local Virtualenv </contributing-docs/07_local_virtualenv.rst>`_.
Auto-generating migration files
-------------------------------
After making changes to the ORM models, you need to generate migration files. You can do this by running
the following command:

.. code-block:: bash

    breeze generate-migration-file -m "Your migration message"

This command will generate a migration file in the ``airflow/migrations/versions`` directory.

These are all available flags of the ``generate-migration-file`` command:

.. image:: ./images/output_generate-migration-file.svg
    :target: https://raw.githubusercontent.com/apache/airflow/main/dev/breeze/images/output_generate-migration-file.svg
    :width: 100%
    :alt: Breeze generate-migration-file

------
Next step: Follow the `Troubleshooting <04_troubleshooting.rst>`_ guide to troubleshoot your Breeze environment.