TranscriptTool: Transcribe Audio/Video

TranscriptTool is a smolagent tool used to transcribe audio and video files into text. This tool allows agents to process multimedia inputs efficiently. Can be used within a smolagent via the Hugging Face API.