Abstract Downloader

Abstract Base Class for Paper Downloaders.

This module defines the AbstractPaperDownloader class, which serves as a base class for downloading scholarly papers from different sources (e.g., arXiv, PubMed, IEEE Xplore). Each concrete downloader should inherit from this class and implement both of its abstract methods, fetch_metadata and download_pdf.

AbstractPaperDownloader

Bases: ABC

Abstract base class for scholarly paper downloaders.

This is designed to be extended for different paper sources like arXiv, PubMed, IEEE Xplore, etc. Each implementation must define methods for fetching metadata and downloading PDFs.

Source code in aiagents4pharma/talk2scholars/tools/paper_download/abstract_downloader.py
from abc import ABC, abstractmethod
from typing import Any, Dict


class AbstractPaperDownloader(ABC):
    """
    Abstract base class for scholarly paper downloaders.

    This is designed to be extended for different paper sources
    like arXiv, PubMed, IEEE Xplore, etc. Each implementation
    must define methods for fetching metadata and downloading PDFs.
    """

    @abstractmethod
    def fetch_metadata(self, paper_id: str) -> Dict[str, Any]:
        """
        Fetch metadata for a given paper ID.

        Args:
            paper_id (str): The unique identifier for the paper.

        Returns:
            Dict[str, Any]: The metadata dictionary (format depends on the data source).
        """

    @abstractmethod
    def download_pdf(self, paper_id: str) -> bytes:
        """
        Download the PDF for a given paper ID.

        Args:
            paper_id (str): The unique identifier for the paper.

        Returns:
            bytes: The binary content of the downloaded PDF.
        """
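As a minimal sketch of how a concrete downloader might satisfy this interface, the example below serves papers from an in-memory dictionary instead of a real source. The `InMemoryDownloader` class, the record layout, and the `demo-001` identifier are hypothetical and exist only for illustration; the base class is repeated in compact form so the block runs on its own.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class AbstractPaperDownloader(ABC):
    """Compact stand-in for the base class documented above."""

    @abstractmethod
    def fetch_metadata(self, paper_id: str) -> Dict[str, Any]:
        """Fetch metadata for a given paper ID."""

    @abstractmethod
    def download_pdf(self, paper_id: str) -> bytes:
        """Download the PDF for a given paper ID."""


class InMemoryDownloader(AbstractPaperDownloader):
    """Hypothetical downloader that serves papers from a dict (no network)."""

    def __init__(self, papers: Dict[str, Dict[str, Any]]) -> None:
        # Assumed layout: paper_id -> {"metadata": {...}, "pdf": b"..."}
        self._papers = papers

    def fetch_metadata(self, paper_id: str) -> Dict[str, Any]:
        # Return a shallow copy so callers cannot mutate the stored record.
        return dict(self._papers[paper_id]["metadata"])

    def download_pdf(self, paper_id: str) -> bytes:
        return self._papers[paper_id]["pdf"]


papers = {
    "demo-001": {
        "metadata": {"title": "An Example Paper"},
        "pdf": b"%PDF-1.4 placeholder bytes",
    }
}
downloader = InMemoryDownloader(papers)
metadata = downloader.fetch_metadata("demo-001")
pdf_bytes = downloader.download_pdf("demo-001")

# The abstract base itself cannot be instantiated: Python raises TypeError
# for any ABC that still has unimplemented abstract methods.
try:
    AbstractPaperDownloader()
    base_is_instantiable = True
except TypeError:
    base_is_instantiable = False
```

A subclass that omits either method fails in the same way at instantiation time, which is how the base class enforces the contract.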

download_pdf(paper_id) abstractmethod

Download the PDF for a given paper ID.

Parameters:

Name      Type  Description                           Default
paper_id  str   The unique identifier for the paper.  required

Returns:

Type   Description
bytes  The binary content of the downloaded PDF.
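Because download_pdf returns raw bytes, the caller decides how to store them. The sketch below shows one possible way to write the result to disk after a basic sanity check. The `save_pdf` helper and the file paths are hypothetical and not part of this module; the check relies only on the fact that PDF files begin with the bytes `%PDF-`.

```python
import tempfile
from pathlib import Path

PDF_MAGIC = b"%PDF-"  # every PDF file starts with this header


def save_pdf(pdf_bytes: bytes, target: Path) -> Path:
    """Hypothetical helper: validate the header, then write the bytes to disk."""
    if not pdf_bytes.startswith(PDF_MAGIC):
        # A source may return something other than a PDF (for example an
        # error page), so refuse to save content without the PDF header.
        raise ValueError("content does not look like a PDF")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(pdf_bytes)
    return target


with tempfile.TemporaryDirectory() as tmp:
    saved = save_pdf(b"%PDF-1.4 placeholder", Path(tmp) / "papers" / "demo.pdf")
    saved_size = saved.stat().st_size
```

In practice the bytes passed to `save_pdf` would come from a concrete downloader's download_pdf call rather than a literal.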

Source code in aiagents4pharma/talk2scholars/tools/paper_download/abstract_downloader.py
@abstractmethod
def download_pdf(self, paper_id: str) -> bytes:
    """
    Download the PDF for a given paper ID.

    Args:
        paper_id (str): The unique identifier for the paper.

    Returns:
        bytes: The binary content of the downloaded PDF.
    """

fetch_metadata(paper_id) abstractmethod

Fetch metadata for a given paper ID.

Parameters:

Name      Type  Description                           Default
paper_id  str   The unique identifier for the paper.  required

Returns:

Type            Description
Dict[str, Any]  The metadata dictionary (format depends on the data source).
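Since the metadata format depends on the data source, calling code cannot assume that any particular key is present. The sketch below illustrates one defensive way to read such a dictionary. The `describe_paper` helper and the `title` and `authors` keys are assumptions for illustration only; the interface does not guarantee them.

```python
from typing import Any, Dict


def describe_paper(metadata: Dict[str, Any]) -> str:
    """Hypothetical helper: build a one-line summary from loosely structured metadata."""
    # Use .get() so a missing or empty key falls back to a placeholder.
    title = metadata.get("title") or "(untitled)"
    authors = metadata.get("authors") or []
    if isinstance(authors, str):
        # Some sources might provide a single string rather than a list.
        authors = [authors]
    byline = ", ".join(str(author) for author in authors) or "unknown authors"
    return f"{title} by {byline}"
```

A downloader that wants stronger guarantees could document its own key set in the subclass docstring while keeping the `Dict[str, Any]` return type.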

Source code in aiagents4pharma/talk2scholars/tools/paper_download/abstract_downloader.py
@abstractmethod
def fetch_metadata(self, paper_id: str) -> Dict[str, Any]:
    """
    Fetch metadata for a given paper ID.

    Args:
        paper_id (str): The unique identifier for the paper.

    Returns:
        Dict[str, Any]: The metadata dictionary (format depends on the data source).
    """