Matt is a Python library for UI automation and testing. It serves as a robust wrapper around pyautogui, providing enhanced functionality for image-based element lookup, region caching for performance optimization, OCR capabilities, and a simplified API for interacting with UI elements.
- Image-based UI Lookup: Define UI elements using image files.
- Performance Optimization: Caches the last known location of UI elements to speed up subsequent searches.
- Resilience: Dedicated methods that wait for elements to appear.
- OCR Support: Built-in optical character recognition using pytesseract.
- Simplified API: Easy-to-use methods for clicking, typing, and moving the mouse.
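
The caching idea in the feature list can be illustrated with a minimal sketch. This is not Matt's actual implementation: `find_with_cache`, `load_cache`, `save_cache`, and the `locate(name, region)` callback are hypothetical names, and the JSON layout is an assumption. The sketch only shows the general pattern of searching the last known region first and falling back to the full screen.

```python
import json
import os


def find_with_cache(name, locate, cache):
    """Search the cached region for `name` first, then the full screen.

    `locate(name, region)` is a hypothetical callback that returns an
    (x, y, width, height) box or None; region=None means "full screen".
    """
    cached = cache.get(name)
    if cached is not None:
        box = locate(name, tuple(cached))
        if box is not None:
            return box
    # Cache miss or stale entry: fall back to a full-screen search.
    box = locate(name, None)
    if box is not None:
        cache[name] = list(box)
    return box


def load_cache(path):
    """Load the cache dictionary from a JSON file, or start empty."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {}


def save_cache(path, cache):
    """Write the cache dictionary to a JSON file, creating its folder."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w") as f:
        json.dump(cache, f)


# Demo with a fake locator that records which regions it was asked to search.
calls = []


def fake_locate(name, region):
    calls.append(region)
    return (40, 60, 20, 10)


cache = {}
first = find_with_cache("start_button", fake_locate, cache)   # full-screen search
second = find_with_cache("start_button", fake_locate, cache)  # cached-region search
```

The second lookup only scans the small cached region, which is where the speed-up would come from in a real image search.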
Import Matt as a git submodule:
git submodule add https://github.com/Michal-Mikolas/matt.git

Then import it in your script:
from matt.matt import Matt
# Initialize Matt
matt = Matt(
    cache_file='cache/matt.json'
)
# Define UI elements (name -> image path)
matt.set_ui({
    'start_button': 'images/start.png',
    'settings_icon': ['images/settings.png', 'images/settings_hover.png'],  # Can provide multiple images for one element
})
# Interact with the UI
matt.click('start_button')
matt.wait('settings_icon')
matt.click('settings_icon')

set_ui: Sets the dictionary of UI elements. Each name maps to one image path or to a list of alternative image paths.
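
Because a value may be either a single path or a list of paths, code that consumes such a dictionary usually normalizes it first. The helper below is a sketch of that step; `normalize_ui` is a hypothetical name and is not part of Matt's API.

```python
def normalize_ui(ui):
    """Return a copy of `ui` in which every value is a list of image paths."""
    normalized = {}
    for name, images in ui.items():
        if isinstance(images, str):
            images = [images]  # wrap a single path in a list
        normalized[name] = list(images)
    return normalized


ui = normalize_ui({
    'start_button': 'images/start.png',
    'settings_icon': ['images/settings.png', 'images/settings_hover.png'],
})
```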
matt.set_ui({
    'submit_btn': 'assets/submit.png',
    'cancel_btn': 'assets/cancel.png'
})

wait: Waits for a UI element to appear on the screen and returns its center coordinates.
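
Waiting for an element generally comes down to polling until a deadline. The following is a sketch of that pattern under stated assumptions, not Matt's code: `wait_for`, the `find` callback, and `poll_interval` are hypothetical, and raising `TimeoutError` is an assumption because the behavior on timeout is not described here.

```python
import time


def wait_for(find, timeout=10.0, poll_interval=0.25):
    """Call `find()` repeatedly until it returns a position or time runs out.

    `find` is a hypothetical callback returning (x, y) or None.
    Raising TimeoutError on expiry is an assumption of this sketch.
    """
    deadline = time.monotonic() + timeout
    while True:
        pos = find()
        if pos is not None:
            return pos
        if time.monotonic() >= deadline:
            raise TimeoutError(f"element not found within {timeout} seconds")
        time.sleep(poll_interval)


# Demo: the element "appears" on the third lookup.
attempts = 0


def fake_find():
    global attempts
    attempts += 1
    return (120, 80) if attempts >= 3 else None


pos = wait_for(fake_find, timeout=2.0, poll_interval=0.01)
```

`time.monotonic()` is used instead of `time.time()` so that system clock adjustments cannot shorten or extend the wait.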
# Wait for the submit button to appear
pos = matt.wait('submit_btn', timeout=10)
print(f"Button found at: {pos}")

which: Waits for any one of the specified UI elements to appear and returns the matched element name together with its position. Useful for handling conditional popups or different states.
# Wait for either 'login_success' or 'login_error'
element, pos = matt.which('login_success', 'login_error')
if element == 'login_success':
    print("Logged in successfully!")
else:
    print("Login failed.")

click: Clicks on a UI element. If ui is None, clicks at the current mouse position.
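
How an offset relates to the element's center can be sketched as follows. This is an illustration only: `click_point` is a hypothetical helper, the box layout (left, top, width, height) follows the pyautogui convention, and the `y` offset is an assumption since only `x` appears in this document.

```python
def click_point(box, x=0, y=0):
    """Return the center of `box` shifted by the (x, y) pixel offset.

    `box` is (left, top, width, height). The `y` parameter is an
    assumption of this sketch; only an `x` offset is documented.
    """
    left, top, width, height = box
    center_x = left + width // 2
    center_y = top + height // 2
    return (center_x + x, center_y + y)


# An 80x30 button at (100, 200), clicked 10 pixels right of its center.
point = click_point((100, 200, 80, 30), x=10)
```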
# Click the submit button
matt.click('submit_btn')
# Click at the current position
matt.click()
# Click 10 pixels to the right of the center of the element
matt.click('submit_btn', x=10)

double_click: Double-clicks on a UI element.
matt.double_click('desktop_icon')

right_click: Right-clicks on a UI element.
matt.right_click('item_row')

move_to: Moves the mouse cursor to a UI element.
matt.move_to('hover_menu')

hotkey: Presses a hotkey combination.
matt.hotkey('ctrl', 'c')
matt.hotkey('alt', 'tab')

typewrite: Types a message string.
matt.typewrite('Hello World!', interval=0.1)

mouse_down, mouse_up: Hold or release the mouse button.
matt.move_to('drag_start')
matt.mouse_down()
matt.move_to('drag_end')
matt.mouse_up()

screenshot: Takes a screenshot.
# Save full screen screenshot
matt.screenshot('myscreen.png')
# Get screenshot object for a specific region
img = matt.screenshot(region=(0, 0, 300, 400))

ocr: Performs OCR on the screen or a specific region. In the current implementation, the returned text has all non-digit characters removed.
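
The effect of digit-only filtering is easy to overlook, so here is a small sketch of what such post-processing does to typical OCR output. `keep_digits` is a hypothetical helper; the exact filter Matt applies may differ.

```python
import re


def keep_digits(text):
    """Remove every character that is not a digit."""
    return re.sub(r"\D", "", text)


score = keep_digits("Score: 1,250 pts\n")
signed = keep_digits("-3.5")
```

Here `score` is `"1250"` and `signed` is `"35"`: separators, signs, and decimal points are all dropped, so this kind of output suits plain integer readouts rather than general text.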
# Read numbers from a specific region
text = matt.ocr(region=(100, 100, 200, 50))
print(text)

select: Selects a region by dragging the mouse from its top-left corner to its bottom-right corner.
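
The drag endpoints implied by a region tuple can be sketched as below, assuming the region is given as (left, top, width, height). `drag_endpoints` is a hypothetical helper for illustration, not part of Matt.

```python
def drag_endpoints(region):
    """Return the (start, end) points of a drag across `region`.

    `region` is assumed to be (left, top, width, height).
    """
    left, top, width, height = region
    start = (left, top)                   # top-left corner
    end = (left + width, top + height)    # bottom-right corner
    return start, end


# A 100x100 square starting at (50, 50).
start, end = drag_endpoints((50, 50, 100, 100))
```

With these two points, a drag amounts to moving to `start`, pressing the mouse button, moving to `end`, and releasing. In pyautogui, that corresponds to `moveTo`, `mouseDown`, `moveTo`, and `mouseUp`.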
# Select a 100x100 square starting at 50,50
matt.select((50, 50, 100, 100))

copy: Performs a copy operation (Ctrl+C) and returns the clipboard content.
# Select text then copy it
matt.select((100, 100, 200, 20))
text = matt.copy()
print(text)