
How to implement a GEMM with FP16 and INT4 using the kernel in FasterTransformer/src/fastertransformer/kernels/cutlass_kernels/fpA_intB_gemm #794

@AkatsukiChiri

I am trying to implement a GEMM that takes FP16 and INT4 operands. I would like to call the fpA_intB_gemm_fp16_int4 kernel located in FasterTransformer/src/fastertransformer/kernels/cutlass_kernels/fpA_intB_gemm directly, but the examples I can find all use it as part of full model inference. If I only want to run the GEMM kernel on its own, what should I do?
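To make the target computation concrete, here is a minimal CPU reference sketch of an FP16 x INT4 GEMM that a standalone kernel call could be checked against. Everything in it is an assumption for illustration, not FasterTransformer's API: the helper names (`to_fp16`, `pack_int4`, `unpack_int4`, `ref_gemm_fp16_int4`) are hypothetical, the weights are assumed to be signed INT4 stored row-major with two values per byte (low nibble first), and one scale per output column is assumed. The real kernel may expect a different packed weight layout.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def pack_int4(values):
    """Pack signed int4 values (-8..7) two per byte, low nibble first (assumed layout)."""
    assert len(values) % 2 == 0
    out = bytearray()
    for lo, hi in zip(values[0::2], values[1::2]):
        out.append((lo & 0xF) | ((hi & 0xF) << 4))
    return bytes(out)


def unpack_int4(packed):
    """Inverse of pack_int4: recover the signed int4 values as Python ints."""
    vals = []
    for b in packed:
        for nib in (b & 0xF, b >> 4):
            vals.append(nib - 16 if nib >= 8 else nib)
    return vals


def ref_gemm_fp16_int4(a, packed_w, scales, k, n):
    """Compute C[i][j] = sum_p A[i][p] * (W[p][j] * scales[j]).

    a        : m x k nested list of floats, rounded to FP16 before use
    packed_w : k x n int4 weights, row-major, packed by pack_int4
    scales   : n per-column dequantization scales (assumed scheme)
    """
    w = unpack_int4(packed_w)
    c = []
    for row in a:
        out_row = []
        for j in range(n):
            acc = 0.0  # accumulate in higher precision, round once at the end
            for p in range(k):
                dequant = to_fp16(w[p * n + j] * scales[j])
                acc += to_fp16(row[p]) * dequant
            out_row.append(to_fp16(acc))
        c.append(out_row)
    return c


if __name__ == "__main__":
    weights = pack_int4([1, -2, 3, 4])  # W = [[1, -2], [3, 4]]
    print(ref_gemm_fp16_int4([[1.0, 2.0]], weights, [0.5, 0.25], k=2, n=2))
```

A reference like this is only useful for comparing numerical results on small shapes; it says nothing about how the kernel itself has to be instantiated or how its weights need to be preprocessed.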