This issue was moved to a discussion.

Bug report - Unable to convert model to CoreML or to C #450

Closed
ephemer opened this issue May 6, 2024 · 2 comments
Labels
bug Something isn't working

ephemer commented May 6, 2024

🐛 Bug

It's not possible to convert the Silero VAD model to work with CoreML or with other conversion tools.

To Reproduce

Steps to reproduce the behavior:

Create the following script in the root directory of this repo, pip install coremltools, then run it:

import torch
import utils_vad # from this repo
import coremltools as ct
import numpy as np


model = utils_vad.init_jit_model("files/silero_vad.jit")
model.eval()

input_features = [
    ct.TensorType(name="audio", shape=torch.Size([512])),
    ct.TensorType(name="sampling_rate", shape=ct.Shape((1,)), dtype=np.int64),
]
output_features = [ct.TensorType(name="output")]

coreml_model = ct.convert(
    model,
    inputs=input_features,
    outputs=output_features,
    minimum_deployment_target=ct.target.iOS15,
    skip_model_load=True,
)

There are too many errors to list. I tried commenting out the places where exceptions are raised to get to the bottom of it, but I couldn't get even a broken output:

  • .name() does not exist on the Tensor C type when determining whether this is a quantized model
  • There are shape inconsistencies between different condition branches ([1,1,512] vs [1, 512])
  • And many more
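
One way to locate the branches behind a shape mismatch like this is to walk the TorchScript graph and count its `prim::If` nodes. The following is only a minimal sketch: `toy_front` is a hypothetical stand-in, not the Silero model, and `count_if_nodes` is an illustrative helper name. For a loaded module, `model.inlined_graph` would presumably be the analogous starting point.

```python
import torch


@torch.jit.script
def toy_front(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in for the model's input handling:
    # the two paths return tensors of different rank.
    if x.dim() == 1:
        x = x.unsqueeze(0)   # [512] -> [1, 512]
    else:
        x = x.unsqueeze(1)   # [1, 512] -> [1, 1, 512]
    return x


def count_if_nodes(graph_or_block) -> int:
    """Count prim::If nodes, recursing into nested blocks."""
    total = 0
    for node in graph_or_block.nodes():
        if node.kind() == "prim::If":
            total += 1
        for block in node.blocks():
            total += count_if_nodes(block)
    return total


n_ifs = count_if_nodes(toy_front.graph)
print("prim::If nodes found:", n_ifs)
```

Each `prim::If` found this way is a candidate for the kind of branch that converters reject when the two paths disagree on output shape.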

Expected behavior

What I'm really trying to do is find a way to include silero-vad in a mobile app without having to bundle ONNX. I wasn't able to convert the .jit model to ONNX myself either (I had hoped the resulting model might be easier to convert to another format). I also tried to use this tool to convert the ONNX model to C, but that fails too because the If operator is not implemented there.

Environment

python collect_env.py
Collecting environment information...
PyTorch version: 2.3.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 14.4.1 (arm64)
GCC version: Could not collect
Clang version: 15.0.0 (clang-1500.3.9.4)
CMake version: version 3.28.3
Libc version: N/A

Python version: 3.12.3 | packaged by Anaconda, Inc. | (main, Apr 19 2024, 11:44:52) [Clang 14.0.6 ] (64-bit runtime)
Python platform: macOS-14.4.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M2 Max

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] torch==2.3.0 # it fails just the same with 2.0.0 and 2.1.0 though
[pip3] torchaudio==2.3.0
[pip3] torchvision==0.18.0
[conda] numpy                     1.26.4          py312h7f4fdc5_0
[conda] numpy-base                1.26.4          py312he047099_0
[conda] pytorch                   2.3.0                  py3.12_0    pytorch
[conda] torchaudio                2.3.0                 py312_cpu    pytorch
[conda] torchvision               0.18.0                py312_cpu    pytorch

Additional context

It would be really helpful to be able to modify the original Silero PyTorch model, for example to remove branching, implement the feature extractor in C directly, and so on. I'm curious whether you have considered that possibility for distribution of upcoming versions?
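
To illustrate what removing branching could look like, here is a minimal sketch, assuming the branch in question only normalizes the input rank. Both function names are hypothetical and this is not the actual Silero code. A single reshape maps `[512]`, `[1, 512]` and `[1, 1, 512]` to the same `[1, 512]` layout, so the scripted graph carries no conditional.

```python
import torch


def normalize_branchy(x: torch.Tensor) -> torch.Tensor:
    # Branching version: scripting this keeps the `if` as control flow.
    if x.dim() == 1:
        x = x.unsqueeze(0)
    return x


def normalize_flat(x: torch.Tensor) -> torch.Tensor:
    # Branch-free version: collapse all leading dims into one batch dim.
    return x.reshape([-1, x.size(-1)])


scripted = torch.jit.script(normalize_flat)

# Check that the scripted graph contains no conditional nodes.
has_if = any(n.kind() == "prim::If" for n in scripted.graph.nodes())

# All three input layouts end up with the same output shape.
shapes = [
    tuple(scripted(torch.zeros(s)).shape)
    for s in ([512], [1, 512], [1, 1, 512])
]
print(has_if, shapes)
```

Note that this flattening assumes a single chunk per call; a real batch dimension greater than one would be preserved by the reshape, but any extra leading dimensions would be merged into it.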

ephemer added the bug label May 6, 2024
@IntendedConsequence

@ephemer I just finished writing a C implementation of the v3.1 16kHz model. I'm working on it as a personal learning project, and it's very much at the proof-of-concept stage. It probably won't build or run anywhere but my machine atm. Having said that, if you don't mind the jank, look in my vadc repo, branch c_port_continued; this function tests the full model implementation: https://github.com/IntendedConsequence/vadc/blob/b5c25db328a5fbee27a421a2d892de42bbaa3dd5/test.c#L1424

ephemer commented May 7, 2024

@IntendedConsequence thanks for sharing, that's really interesting work! 🙏🏼

Repository owner locked and limited conversation to collaborators May 29, 2024
snakers4 converted this issue into discussion #456 May 29, 2024