
I followed this example:

  1. Create the example timetable file and put it in `$HOME/airflow/plugins`.
  2. Create the example DAG file and put it in `$HOME/airflow/dags`.

After restarting the scheduler and webserver, I get a DAG import error. In the web UI, the last line of the detailed error message is:

airflow.exceptions.SerializationError: Failed to serialize DAG 'example_timetable_dag2': Timetable class 'AfterWorkdayTimetable' is not registered

But if I run `airflow plugins`, I can see the timetable in the name and source list.

How can I fix this error?

Details of `plugins/AfterWorkdayTimetable.py`:

from datetime import timedelta
from typing import Optional

from pendulum import Date, DateTime, Time, timezone

from airflow.plugins_manager import AirflowPlugin
from airflow.timetables.base import DagRunInfo, DataInterval, TimeRestriction, Timetable

UTC = timezone("UTC")


class AfterWorkdayTimetable(Timetable):
    def infer_data_interval(self, run_after: DateTime) -> DataInterval:
        weekday = run_after.weekday()
        if weekday in (0, 6):  # Monday and Sunday -- interval is last Friday.
            days_since_friday = (run_after.weekday() - 4) % 7
            delta = timedelta(days=days_since_friday)
        else:  # Otherwise the interval is yesterday.
            delta = timedelta(days=1)
        start = DateTime.combine((run_after - delta).date(), Time.min).replace(tzinfo=UTC)
        return DataInterval(start=start, end=(start + timedelta(days=1)))

    def next_dagrun_info(
        self,
        *,
        last_automated_data_interval: Optional[DataInterval],
        restriction: TimeRestriction,
    ) -> Optional[DagRunInfo]:
        if last_automated_data_interval is not None:  # There was a previous run on the regular schedule.
            last_start = last_automated_data_interval.start
            last_start_weekday = last_start.weekday()
            if 0 <= last_start_weekday < 4:  # Last run on Monday through Thursday -- next is tomorrow.
                delta = timedelta(days=1)
            else:  # Last run on Friday -- skip to next Monday.
                delta = timedelta(days=(7 - last_start_weekday))
            next_start = DateTime.combine((last_start + delta).date(), Time.min).replace(tzinfo=UTC)
        else:  # This is the first ever run on the regular schedule.
            next_start = restriction.earliest
            if next_start is None:  # No start_date. Don't schedule.
                return None
            if not restriction.catchup:
                # If the DAG has catchup=False, today is the earliest to consider.
                next_start = max(next_start, DateTime.combine(Date.today(), Time.min).replace(tzinfo=UTC))
            elif next_start.time() != Time.min:
                # If earliest does not fall on midnight, skip to the next day.
                next_day = next_start.date() + timedelta(days=1)
                next_start = DateTime.combine(next_day, Time.min).replace(tzinfo=UTC)
            next_start_weekday = next_start.weekday()
            if next_start_weekday in (5, 6):  # If next start is in the weekend, go to next Monday.
                delta = timedelta(days=(7 - next_start_weekday))
                next_start = next_start + delta
        if restriction.latest is not None and next_start > restriction.latest:
            return None  # Over the DAG's scheduled end; don't schedule.
        return DagRunInfo.interval(start=next_start, end=(next_start + timedelta(days=1)))


class WorkdayTimetablePlugin(AirflowPlugin):
    name = "workday_timetable_plugin"
    timetables = [AfterWorkdayTimetable]

Details of `dags/test_afterwork_timetable.py`:

import datetime

from airflow import DAG
from AfterWorkdayTimetable import AfterWorkdayTimetable
from airflow.operators.dummy import DummyOperator


with DAG(
    dag_id="example_workday_timetable",
    start_date=datetime.datetime(2021, 1, 1),
    timetable=AfterWorkdayTimetable(),
    tags=["example", "timetable"],
) as dag:
    DummyOperator(task_id="run_this")

If I run `airflow plugins`:

name                              | source                                   
==================================+==========================================
workday_timetable_plugin          | $PLUGINS_FOLDER/AfterWorkdayTimetable.py       
tfull
  • it should be officially fixed in [PR](https://github.com/apache/airflow/pull/19878) – tfull Apr 01 '22 at 06:59
  • Did you find any solution? I have a similar issue with MWAA version 2.2.2. The same code works with normal Airflow, but it does not work with AWS managed Airflow. – Rahul Patel May 10 '22 at 13:40
  • This fix is merged into Airflow 2.2.3. Maybe you could try upgrading to a version newer than 2.2.3? – tfull May 12 '22 at 11:36

5 Answers


I had a similar issue.

Either add an `__init__.py` file to the plugins folder, or try the following to debug your issue:

Get all plugin manager objects:

    from airflow import plugins_manager
    plugins_manager.initialize_timetables_plugins()
    plugins_manager.timetable_classes

I got this result: {'quarterly.QuarterlyTimetable': <class 'quarterly.QuarterlyTimetable'>}

Compare your result with the exception message. If the `timetable_classes` dictionary has a different plugin name than the one in the exception, change the plugin file path (or the import) so the two match.

You could also try this inside DAG python file:

from AfterWorkdayTimetable import AfterWorkdayTimetable
from airflow import plugins_manager
print(plugins_manager.as_importable_string(AfterWorkdayTimetable))

This would help you find the name that airflow tries to use when searching through timetable_classes dictionary.

Bakuchi
  • Unfortunately the names are exactly the same in my case, but I still have the issue that the timetable class is not registered. Using Airflow 2.2.3 with Python 3.9.9. Output: `print(plugins_manager.timetable_classes)` gives `{'holiday_calendar_timetable.HolidayCalendarTimetable': <class 'holiday_calendar_timetable.HolidayCalendarTimetable'>}` while the exception shows: `raise _TimetableNotRegistered(importable_string) airflow.serialization.serialized_objects._TimetableNotRegistered: Timetable class 'holiday_calendar_timetable.HolidayCalendarTimetable' is not registered` – Paul Fennema Feb 04 '22 at 11:38
  • What is even more annoying is that when running airflow in standalone mode and not on kubernetes it is working fine actually. – Paul Fennema Feb 04 '22 at 11:40
  • and I'm using the official plugins directory for the timetable plugin so there should not be any discrepancy between what airflow thinks where it should be and where python finds it. – Paul Fennema Feb 04 '22 at 11:49
  • Found the issue with kubernetes: When using plugins in the kubernetes environment, make sure that you add the plugins directory as extraVolumes also for the workers and webserver, as otherwise the plugin module will not be found when trying to execute the python code. – Paul Fennema Feb 04 '22 at 15:21

You need to register the timetable in the `timetables` list via the plugin interface. See:

https://airflow.apache.org/docs/apache-airflow/stable/plugins.html

Jarek Potiuk
  • Thanks for your reply. It seems the official example already has: ```class WorkdayTimetablePlugin(AirflowPlugin): name = "workday_timetable_plugin" timetables = [AfterWorkdayTimetable] ```. Do I need to register it in another place? – tfull Oct 28 '21 at 01:54

I encountered the same issue. These are the steps I followed:

  1. Add the timetable file (custom_tt.py) to the plugins folder.

  2. Make sure an `__init__.py` file is present in the plugins folder.

  3. Set `lazy_load_plugins = False` in airflow.cfg.

  4. Add the import statement in the DAG file: from custom_tt import CustomTimeTable

  5. In the DAG, pass timetable=CustomTimeTable().

  6. Restart the webserver and scheduler. Problem fixed.

Rajeev S

This has since been fixed, but the documentation doesn't seem to have been updated to reflect the fix just yet.

Your function

def infer_data_interval(self, run_after: DateTime) -> DataInterval:

should be

def infer_manual_data_interval(self, *, run_after: DateTime) -> DataInterval:

See reference: Apache airflow.timetables.base Documentation

After updating the function with the correct name and extra parameter, everything else should work for you as it did for me.
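For reference, here is a standard-library-only sketch of the corrected method's logic, using plain `datetime` objects instead of Airflow/pendulum types so it runs without Airflow installed (the function name matches the fix; the tuple return value is a simplification of `DataInterval`):

```python
from datetime import datetime, time, timedelta, timezone

UTC = timezone.utc

def infer_manual_data_interval(*, run_after: datetime):
    # Same logic as the question's method, with the corrected keyword-only
    # signature: a manual run maps back to the most recent workday.
    weekday = run_after.weekday()
    if weekday in (0, 6):  # Monday or Sunday -- interval is last Friday.
        delta = timedelta(days=(weekday - 4) % 7)
    else:  # Otherwise the interval is yesterday.
        delta = timedelta(days=1)
    start = datetime.combine((run_after - delta).date(), time.min, tzinfo=UTC)
    return start, start + timedelta(days=1)

# A manual run on a Monday at noon maps back to Friday's interval:
start, end = infer_manual_data_interval(run_after=datetime(2021, 10, 25, 12, tzinfo=UTC))
print(start.date(), end.date())  # 2021-10-22 2021-10-23
```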


I was running into this as well. `airflow plugins` reports that the plugin is registered, and running the DAG script on the command line works fine, but the web UI reports that the plugin is not registered. @Bakuchi's answer pointed me in the right direction.

In my case, the problem was how I was importing the Timetable - airflow apparently expects you to import it relative to the $PLUGINS_FOLDER, not from any other directory, even if that other directory is also on the PYTHONPATH.

For a concrete example:

export PYTHONPATH=/path/to/my/code:$PYTHONPATH

# airflow.cfg
plugins_folder = /path/to/my/code/airflow_plugins

# dag.py
import sys
from airflow_plugins.custom_timetable import CustomTimetable as Bad
from custom_timetable import CustomTimetable as Good
from airflow import plugins_manager

plugins_manager.initialize_timetables_plugins()
print(sys.path)                                    # /path/to/my/code:...:/path/to/my/code/airflow_plugins
print(plugins_manager.as_importable_string(Bad))   # airflow_plugins.custom_timetable.CustomTimetable
print(plugins_manager.as_importable_string(Good))  # custom_timetable.CustomTimetable
print(plugins_manager.timetable_classes)           # {'custom_timetable.CustomTimetable': <class 'custom_timetable.CustomTimetable'>}

A bad lookup in `plugins_manager.timetable_classes` is ultimately what ends up raising the `_TimetableNotRegistered` error, so the fix is to make the keys match by changing how the timetable is imported.
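The mechanism can be illustrated without Airflow at all: the registry key is derived from the class's `__module__`, so the same class imported through a different module path produces a different key (the module and class names below are hypothetical, mirroring the example above):

```python
def importable_string(cls) -> str:
    # Roughly how the plugin manager builds the timetable registry key.
    return f"{cls.__module__}.{cls.__qualname__}"

class CustomTimetable:
    pass

# The plugin manager loads the file relative to plugins_folder, so the
# class's module is just "custom_timetable".
CustomTimetable.__module__ = "custom_timetable"
registry = {importable_string(CustomTimetable): CustomTimetable}

# A DAG that imported the class through the package path serializes a
# different key, and the lookup fails -> "not registered".
print("airflow_plugins.custom_timetable.CustomTimetable" in registry)  # False
print("custom_timetable.CustomTimetable" in registry)                  # True
```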

I submitted a bug report: https://github.com/apache/airflow/issues/21259

0x5453
  • Have you been able to import when the plugin file was in a subdirectory of /plugins? – Erik Mar 24 '22 at 19:02